Methods of screening for cell proliferation or neoplastic disorders

ABSTRACT

The invention relates to methods and compositions for identifying subjects having, or predisposed to having, a cell proliferation or neoplastic disorder. The methods are applicable to any type of tissue sample and can be conducted on otherwise normal tissue.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a divisional application of U.S. application Ser. No. 11/145,331 filed Jun. 3, 2005, now issued as U.S. Pat. No. 9,086,403, which claims the benefit under 35 USC §119(e) to U.S. Application Ser. No. 60/656,470 filed Feb. 24, 2005; U.S. Application Ser. No. 60/646,296 filed Jan. 24, 2005; and U.S. Application Ser. No. 60/576,566 filed Jun. 3, 2004, all now expired. The disclosure of each of the prior applications is considered part of and is incorporated by reference in the disclosure of this application.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

This invention was made with government support under Grant Nos. R01CA65145 and K08CA106610 awarded by the National Institutes of Health. The government has certain rights in the inventions.

BACKGROUND INFORMATION

1. Field of the Invention

This invention relates generally to screening for risk or presence of neoplastic disorders, and more particularly to screening for biomarkers present in a biological sample obtained from a subject that are indicative of a predisposition for a neoplastic (e.g., benign or malignant) or cell proliferative disorder.

2. Background Information

Each mammalian cell carries two copies of each gene, one inherited from the mother (on the maternal chromosome) and one inherited from the father (on the paternal chromosome). Most of the autosomal genes and X-linked genes in females are therefore biallelic, i.e., both paternal and maternal alleles of the gene are expressed and the information of both copies is actively used in protein synthesis. However, in humans and other mammals, monoallelic expression of biallelic genes has been demonstrated. Allelic exclusion can result from two different mechanisms. The first mechanism is independent of the parental origin. The second mechanism, called genomic imprinting, is an epigenetic modification of a specific parental chromosome in the gamete or zygote that leads to monoallelic or differential expression of the two alleles of a gene in somatic cells of the offspring. Imprinting affects various essential cellular and developmental processes, including intercellular signaling, RNA processing, cell cycle control, and promotion or inhibition of cellular division and growth.

Imprinted genes can show monoallelic expression in some tissues and biallelic expression in others. For example, the insulin-like growth factor II gene (IGF2) is imprinted in most tissues but is biallelically expressed in brain and adult liver. Loss of imprinting (LOI) of the IGF2 gene, or activation of the normally silent maternally inherited allele, occurs in many common cancers (Feinberg, A., Semin. Cancer Biol. 14, 427 (2004)). The term LOI simply means loss of preferential parental origin-specific gene expression and can involve either abnormal expression of the normally silent allele, leading to biallelic expression, or silencing of the normally expressed allele, leading to epigenetic silencing of the locus. About 10% of the population shows LOI of IGF2, and this molecular trait is associated with a personal and/or family history of colorectal neoplasia (Cui et al., Science 299, 1753 (2003); Woodson et al., J. Natl. Cancer Inst. 96, 407 (2004)). Imprinting of IGF2 is regulated by a differentially methylated region (DMR) upstream of the nearby untranslated H19 gene. Deletion of the DMR leads to biallelic expression (LOI) of IGF2 in the offspring when the deletion is maternally inherited (Leighton, et al., Nature 375, 34 (1995); Ripoche, et al., Genes Dev. 11, 1596 (1997)). Thus, abnormal imprinting in cancer can lead to activation of normally silent alleles of growth-promoting genes.

Currently, no single biochemical marker, or plurality of biochemical markers, reliably identifies a subject at risk for developing a disease associated with LOI and/or uncontrolled cell proliferation or neoplastic disorders (e.g., benign and cancer). Thus, there exists a need for diagnostic methods and compositions that can utilize cell differentiation information to identify those individuals at risk for developing a cell proliferation or neoplastic disorder. Such information can optionally be correlated with abnormal gene expression resulting from epigenetic alterations in the genome of a subject. Early implementation of a prophylactic therapy and periodic screening can lead to prevention of such a disorder.

SUMMARY OF THE INVENTION

The present invention is based on the discovery that alterations in ratios of differentiated and undifferentiated cell populations can be used as early indicators for the risk of developing cell proliferation or neoplastic disorders. In general, the invention features methods of determining a subject's risk of developing a cell proliferation or neoplastic disorder, such as cancer, by obtaining a biological sample, such as from blood or intestinal tissue, from a subject and determining the level of cell differentiation in the same or a different tissue. Optionally, this information can be correlated with an alteration in the expression of a target gene. An alteration in expression of a target gene can directly or indirectly result from a loss of imprinting of a target gene.

In one embodiment, a method of determining predisposition of a subject to developing a cell proliferation or neoplastic disorder is provided. The method includes determining the ratio of undifferentiated to differentiated cells in a normal biological sample from the subject. The ratio of undifferentiated to differentiated cells, as compared to a reference ratio, is indicative of a predisposition for developing a cell proliferation or neoplastic disorder. Optionally, the method further includes identifying cells displaying abnormal expression of at least one target gene in the same or different biological sample from the subject. A target gene includes any gene the expression of which is affected by loss of imprinting. For example, the expression of the H19 gene or IGF2 gene is directly affected by their imprinting status. However, the expression of an IGF2-related gene, such as Igf1R, IRS-1, IRS-2, PI3K, Akt, p70S6 kinase, FOXO, GSK3, MDM2, mTOR, Cyclin D1, c-Myc, Shc, Grb2, SOS, Ras, Raf, MEK, Erk, or MAPK gene, is indirectly affected by the imprinting status of H19 and/or IGF2. Thus, the expression of IGF2-related genes can be stimulated by a loss of imprinting of, for example, the IGF2 gene. In general, methods of the invention include analyzing the biological sample for a change in the expression of a target gene that is directly or indirectly associated with loss of imprinting, or a polymorphism thereof. Loss of imprinting can result from, for example, a change in the methylation status of the gene. The change in methylation status can be hypomethylation of, for example, a differentially methylated region (DMR) of the H19 gene and/or a DMR of the IGF2 gene. Subsequently, a reference ratio can be generated from tissue obtained from a subject that includes cells displaying normal imprinting of at least one of the H19 gene and the IGF2 gene.

In another embodiment, determining the ratio of undifferentiated to differentiated cells in the sample includes identifying a biomarker associated with a differentiated or undifferentiated cell. The biomarker can include, but is not limited to, Shh (Sonic hedgehog), Tcf4, Lef1, Twist, EphB2, EphB3, Hes1, Notch1, Hoxa9, Dkk1, Tle6, Tcf3, Bmi1, Kit, Musashi1 (Msi1), Cdx1, Hes5, Oct4, Ki-67, β-catenin, Noggin, BMP4, PTEN (phosphorylated PTEN), Akt (phosphorylated Akt), Villin, Aminopeptidase N (anpep), Sucrase isomaltase (SI), Ephrin-B1 (EfnB1), Cdx2, Crip, Apoa1, Aldh1b1, Calb3, Dgat1, Dgat2, Clu, Hephaestin, Gas1, Ihh (Indian hedgehog), Intrinsic factor B12 receptor, IFABP, or KLF4.

In another embodiment, determining the ratio of undifferentiated to differentiated cells in the sample can include: a) imaging the sample using immunohistochemical identification of biomarker molecules specifically associated with a differentiated or undifferentiated cell population; b) imaging the sample using standard microscopy and distinguishing differentiated from undifferentiated cells using morphologic measurements; c) imaging the sample using immunohistochemical identification of proliferation antigens and their distribution within colonic crypts; or d) imaging the sample using immunofluorescent identification of molecules specific to a biomarker associated with a differentiated or undifferentiated cell population. Nucleic acid analyses can also be performed, for example, e) measuring RNA levels; f) measuring gene expression; g) whole genome expression analyses; or h) allele-specific expression analysis.

In some embodiments, the cells can be epithelial cells obtained from, for example, a rectal “Pap” test (e.g., a scraped sample). In alternative embodiments, the epithelial cells can be obtained from intestinal tissue, such as, for example, the colon. In other embodiments, the cells can be obtained from the lumen of the intestinal tissue. In other embodiments, the cells can be obtained from the crypts of the lumen. The cell proliferation or neoplastic disorder can be associated with a solid tumor such as, for example, an adenoma.

The methods of the invention encompass screening tissue from subjects not previously known to have a cell proliferation or neoplastic disorder, such as a neoplasm of the colon. For example, the results of the methods provided herein can be correlated with the subject's family genetic history. In addition, the subject can be subjected to additional tests including, but not limited to, chest X-rays, colorectal examinations, endoscopic examination, MRI, CAT scanning, gallium scanning, and barium imaging.

In other embodiments, methods of determining whether a subject is predisposed to developing a cell proliferation or neoplastic disorder include obtaining a biological sample from a subject and contacting the sample with an array of immobilized biomolecules that specifically interact with a biomarker indicative of a differentiated or undifferentiated cell. The methods further include obtaining a subject profile by detecting a modification of the biomolecules that is indicative of the ratio of differentiated to undifferentiated cells in the sample. The subject profile can be compared with a reference profile that includes one or more values, each value representing the level of biomarker in a reference sample obtained from one or more reference subjects displaying normal imprinting of the target gene. In some embodiments, the biomolecules can be proteins, such as antibodies (e.g., monoclonal antibodies). In other embodiments the biomolecules can be antigens or receptors. Optionally, the method further includes identifying cells displaying abnormal expression of at least one target gene in the same or different biological sample from the subject.

In another embodiment, diagnostic kits for detecting a cell proliferation or neoplastic disorder, or a predisposition to a cell proliferation or neoplastic disorder, are provided. Such kits can include an array for detecting a biomarker indicative of a differentiated or undifferentiated cell in a sample obtained from a subject. The array can include a substrate having a plurality of addresses, each address having disposed thereon an immobilized biomolecule, wherein each biomolecule individually detects a biomarker indicative of a differentiated or undifferentiated cell. Optionally, the kit can include a means for identifying abnormal imprinting of at least one target gene in the biological sample.

In another embodiment, methods of determining whether a therapy regimen is effective for preventing or inhibiting a cell proliferation or neoplastic disorder are provided. Such methods include identifying a subject that is predisposed to developing a cell proliferation or neoplastic disorder and administering to the subject a therapy that inhibits or prevents an increase in the number of undifferentiated cells in a target tissue of the subject. The methods further include contacting a biological sample comprising non-neoplastic cells from the subject with an array of immobilized biomolecules that specifically interact with a biomarker indicative of a differentiated or undifferentiated cell and obtaining a subject profile by detecting a modification of the biomolecules, wherein the modification is indicative of the ratio of differentiated to undifferentiated cells in the sample. The subject profile can be compared with a reference profile that includes one or more values, each value representing the level of biomarker in a reference sample obtained from one or more reference subjects displaying normal imprinting of the target gene. Such theranostic methods can include providing the determination to a caregiver and altering the therapy based upon the determination.

In other embodiments, methods of preparing an undifferentiated cell are provided. Such methods include contacting a more committed cell with an agent that causes the more committed cell to dedifferentiate into an undifferentiated cell, wherein the agent affects the imprinting of at least one of the H19 gene and the IGF2 gene. The committed cells can be normal or cancer cells. In some embodiments, the committed cells are differentiated cells.

In other embodiments, methods of producing an altered cell population comprising undifferentiated cells capable of being recommitted into more differentiated cells are provided. Such methods include contacting an initial cell population comprising committed cells with an agent that modulates the imprinting status of a target gene in a cell derived from epithelial tissue, culturing the cells, and identifying the undifferentiated cells or recovering the undifferentiated cells from the altered cell population. Such cells can be recovered using a biomarker as described herein.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1A-1F depict immunohistochemical analysis of villin and musashi1 in 120 day old LOI(−) and LOI(+) mice.

FIGS. 2A-2H depict a shift to less differentiated colon epithelium in a mouse H19 DMR mutation model and in colonoscopy clinic patients with LOI.

FIGS. 3A-3C depict mouse models of H19 deletion and DMR mutation.

FIGS. 4A and 4B depict Igf2 mRNA and protein levels.

FIGS. 5A and 5B depict histomorphology of small intestinal mucosa in LOI(−) mice (FIG. 5A) versus LOI(+) mice (FIG. 5B).

FIGS. 6A-6D depict immunohistochemistry for villin and ephrin-B1 in 42 day mice.

FIGS. 7A-7F depict immunohistochemistry for musashi1 and twist in 42 day mice.

FIGS. 8A-8F depict in situ hybridization analysis of Igf2 mRNA levels in mouse gut with mutation in the H19 DMR (142* mouse).

FIGS. 9A-9F depict in situ hybridization analysis of H19 mRNA levels in E16.5 mouse embryos with mutation in the H19 DMR.

FIGS. 10A and 10B depict musashi1 immunostaining of normal colon of a colonoscopy patient without LOI and a patient with LOI.

Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

DETAILED DESCRIPTION OF THE INVENTION

Methods and compositions for detecting a modification in the ratio of differentiated and undifferentiated cells in a biological sample from a subject are provided. Such modifications may result from epigenetic alterations that 1) shift normal tissue to a more undifferentiated state; 2) increase the target cell population for subsequent genetic alterations; or 3) act independently in tumor initiation. Thus, the methods of the invention allow for determining a change in the balance or ratio of undifferentiated to differentiated cells in the sample.

The present invention has many embodiments and relies on many patents, applications and other references for details known to those of the art. Therefore, when a patent, application, or other reference is cited or repeated below, it should be understood that it is incorporated by reference in its entirety for all purposes as well as for the proposition that is recited. For example, methods and compositions for detecting a loss of imprinting (LOI) indicative of an increased risk of developing cancer are disclosed in U.S. Pat. App. Pub. No. 20040219559 (application Ser. No. 10/629,318), U.S. Pat. App. Pub. No. 20040002082 (application Ser. No. 10/336,552), and U.S. Pat. App. Pub. No. 20010007749 (application Ser. No. 10/759,917), each of which also is hereby incorporated by reference in its entirety for all purposes.

As used in this application, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise. For example, the term “an agent” includes a plurality of agents, including mixtures thereof.

The practice of the present invention may employ, unless otherwise indicated, conventional techniques and descriptions of organic chemistry, polymer technology, molecular biology (including recombinant techniques), cell biology, biochemistry, and immunology, which are within the skill of the art. Such conventional techniques include polymer array synthesis, hybridization, ligation, and detection of hybridization using a label. Specific illustrations of suitable techniques can be had by reference to the example herein below. However, other equivalent conventional procedures can, of course, also be used. Such conventional techniques and descriptions can be found in standard laboratory manuals such as Genome Analysis: A Laboratory Manual Series (Vols. I-IV), Using Antibodies: A Laboratory Manual, Cells: A Laboratory Manual, PCR Primer: A Laboratory Manual, and Molecular Cloning: A Laboratory Manual (all from Cold Spring Harbor Laboratory Press); Stryer, L. (1995) Biochemistry (4th Ed.) Freeman, New York; Gait, “Oligonucleotide Synthesis: A Practical Approach” 1984, IRL Press, London; Nelson and Cox (2000), Lehninger, Principles of Biochemistry 3rd Ed., W. H. Freeman Pub., New York, N.Y.; and Berg et al. (2002) Biochemistry, 5th Ed., W. H. Freeman Pub., New York, N.Y., all of which are herein incorporated in their entirety by reference for all purposes.

In addition to their use to identify subjects who are at risk, the new methods can be used as a routine screen or “pre-screen” for subjects that may have a family genetic history of cancer, such as colon cancer or pancreatic cancer. The methods can also identify those subjects who are not currently at risk for developing cancer, thus avoiding the need for additional testing.

“Genomic imprinting” or “allelic exclusion according to parent of origin” is a mechanism of gene regulation by which only one of the parental copies of a gene is expressed. Paternal imprinting means that an allele inherited from the father is not expressed in offspring. Maternal imprinting means that an allele inherited from the mother is not expressed in offspring. Imprinted genes are the genes for which one of the parental alleles is repressed whereas the other one is transcribed and expressed. The expression of an imprinted gene may vary in different tissues or at different developmental stages. Imprinted genes may be expressed in a variety of tissue or cell types such as muscle, liver, spleen, lung, central nervous system, kidney, testis, ovary, pancreas, placenta, skin, adrenal, parathyroid, bladder, breast, pituitary, intestinal, salivary gland, blood cells, lymph node, and others known in the art. For instance, IGF2 imprinting results in repression of the maternally-derived allele in most tissues except brain, adult liver and chondrocytes (Vu T. H. and Hoffman A. R. (1994) Nature, 371:714-717).

Genomic imprinting has been implicated in cell proliferation or neoplastic disorders such as cancer. For example, loss of heterozygosity (LOH) in the childhood Wilms tumor (WT) occurs on chromosome 11. Examination of RNA from Wilms tumor led to a discovery that not one but both IGF2 alleles were expressed in 70% of Wilms tumors. In addition, in 30% of cases, both alleles of H19 were expressed. In contrast, examination of RNA from normal tissue shows normal imprinting with the expression of one allele of IGF2 and H19. The term for this novel genetic alteration is loss of imprinting (LOI), which simply means loss of preferential parental origin-specific gene expression and can involve either abnormal expression of the normally silent allele, leading to biallelic expression, or silencing of the normally expressed allele, leading to epigenetic silencing of the locus. Thus, abnormal imprinting in cancer can lead to activation of normally silent alleles of growth-promoting genes.

DNA methylation plays a role in the control of genomic imprinting. First, some imprinted genes in mice, such as H19, show parental origin-specific, tissue-independent methylation of CpG islands. This methylation represents imprinting on the paternal chromosome and is not secondary to changes in gene expression. Second, knockout mice deficient in DNA methyltransferase, and exhibiting widespread genomic hypomethylation, do not show allele-specific methylation of the H19 CpG island and exhibit biallelic expression of H19 and loss of expression of IGF2. Similar parental origin-specific methylation has also been observed for a CpG island in the first intron of the maternally inherited, expressed allele of the IGF2 receptor gene (IGF2R). Methyltransferase deficient knockout mice show loss of methylation of IGF2R and epigenetic silencing of the gene.

Widespread alterations in DNA methylation in human tumors were discovered years ago (Feinberg, A. P. (1993) Nature Genet. 4:110-113) and remain the most commonly found alteration in human cancers. These alterations are ubiquitous to both benign and malignant neoplasms. Both decreased and increased methylation have been found at specific sites in tumors, with an overall decrease in quantitative DNA methylation (Feinberg et al. (1988) Cancer Res. 48:1159-1161; Feinberg, A. P. (1988) Prog. Clin. Biol. Res. 79:309-317).

In humans, as in mice, the paternal allele of a CpG island in the H19 gene and its promoter is normally methylated, and the maternal allele is unmethylated. Because tumors with LOI of IGF2 showed reduced expression of H19, the methylation pattern of H19 has been examined in tumors with LOI. In all cases showing LOI of IGF2, the H19 promoter exhibits 90%-100% methylation at the sites normally unmethylated on the maternally inherited allele. Thus, the maternal allele has acquired a paternal pattern of methylation, consistent with observed expression of IGF2 on the same maternally derived chromosome in these tumors. In contrast, tumors without LOI of IGF2 show no change in the methylation of H19, indicating that these changes are related to abnormal imprinting and not malignancy per se. The same alterations in methylation of the maternal allele of H19 are found in patients with Beckwith-Wiedemann syndrome (BWS) having LOI of IGF2. BWS is a disorder of prenatal overgrowth and cancer, transmitted as an autosomal dominant trait, or arising sporadically.

Another mechanism by which LOI may act involves disruption of an imprinting control center on chromosome 11, similar to that recently described for the Prader-Willi/Angelman syndrome (PWS/AS) region of chromosome 15 (Dittrich et al. (1996) Nat. Genet. 14:163-170). Thus, disruption of a gene spanning this region could cause abnormal imprinting, as well as BWS and/or cancer, at least when inherited through the germline.

Another mechanism for LOI involves loss of trans-acting factors which may establish and maintain a normal pattern of genomic imprinting once such a pattern is established in the germline. Trans-acting modifiers of imprinting are likely to exist, since imprinting of transgenes is host strain-dependent. Such genes may thus act as tumor suppressor genes in humans and other species.

Yet another mechanism of imprinting that may be disrupted in cancer involves histone deacetylation, which is linked to X-inactivation in mammals and to telomere silencing in yeast. Genes for both histone acetylase and histone deacetylase have recently been isolated (Brownell et al. (1996) Cell 84:843-851; Taunton et al. (1996) Science 272:408-411). In addition, telomere silencing in yeast also involves the action of specific genes, e.g., SIR1-SIR4, some of which have homologues in mammals (Brachmann et al. (1995) Genes Develop. 9:2888-2902). Similarly, some examples of gene silencing in mammals may resemble position-effect variegation in Drosophila, a form of position-dependent epigenetic silencing (Walters et al. (1996) Genes Develop. 10:185-195). Finally, imprinted loci on maternal and paternal chromosomes may interact during DNA replication. Chromosomal regions harboring imprinted genes show replication timing asynchrony (Kitsberg et al. (1993) Nature 364:459-463). Furthermore, the two parental homologues of some imprinted genes show nonrandom proximity in late S-phase (LaSalle, J. M. and Lalande, M. (1996) Science 272:725-728), indicating a form of chromosomal cross-talk, as has been observed for epigenetic silencing in Drosophila (Tartoff, K. D. and Henikoff, S. (1991) Cell 65:201-203).

The human IGF2 and H19 genes are normally imprinted, i.e., show preferential expression of a specific parental allele. Some tumors undergo loss of imprinting (LOI), with one or more of the following: biallelic expression of IGF2; epigenetic silencing of H19; and/or abnormal expression of the paternal H19 allele. This observation has been extended to a wide variety of childhood and adult malignancies. Normal imprinting can be maintained in part by allele-specific, tissue-independent methylation of H19, since LOI is associated with abnormal methylation of the normally unmethylated maternal H19 allele.

Methods of Identifying at-Risk Subjects

In one embodiment, methods of determining predisposition of a subject to developing a cell proliferation or neoplastic disorder are provided. In general, the subject is a human. The methods include determining the ratio of undifferentiated to differentiated cells in a sample obtained from a subject and generating a subject profile. The ratio of undifferentiated to differentiated cells, as compared to a reference ratio or reference profile, is indicative of a predisposition for developing a cell proliferation or neoplastic disorder. Optionally, the methods include identifying cells displaying abnormal imprinting of at least one target gene in the normal biological sample from the subject, or cells displaying increased levels of IGF2 gene expression.
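By way of illustration only, the comparison of a subject's ratio to a reference ratio can be sketched as follows. The class and function names, and in particular the fold-change cutoff of 1.5, are hypothetical assumptions introduced for the sketch; the specification does not prescribe a particular cutoff or data structure.

```python
from dataclasses import dataclass


@dataclass
class CellCounts:
    """Cell counts scored in one sample (hypothetical container)."""
    undifferentiated: int
    differentiated: int


def undifferentiated_ratio(counts: CellCounts) -> float:
    """Ratio of undifferentiated to differentiated cells in a sample."""
    if counts.differentiated <= 0:
        raise ValueError("at least one differentiated cell is required")
    return counts.undifferentiated / counts.differentiated


def is_elevated(subject_ratio: float, reference_ratio: float,
                fold_cutoff: float = 1.5) -> bool:
    """Flag a subject ratio that exceeds the reference ratio by a chosen fold.

    fold_cutoff is an illustrative assumption, not a value from the text.
    """
    if reference_ratio <= 0:
        raise ValueError("reference ratio must be positive")
    return subject_ratio >= fold_cutoff * reference_ratio


subject = undifferentiated_ratio(CellCounts(undifferentiated=30, differentiated=120))
flagged = is_elevated(subject, reference_ratio=0.10)
```

In practice the cutoff would be chosen empirically from reference subjects and the tissue being tested.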

“Target gene,” as used herein, includes any genomic sequence the expression of which is altered, directly or indirectly, by genomic imprinting. A change in genomic imprinting can include loss of imprinting. For example, the expression of the H19 gene or IGF2 gene is directly affected by their imprinting status. However, the expression of an IGF2-related gene, such as Igf1R, IRS-1, IRS-2, PI3K, Akt, p70S6 kinase, FOXO, GSK3, MDM2, mTOR, Cyclin D1, c-Myc, Shc, Grb2, SOS, Ras, Raf, MEK, Erk, or MAPK gene, is indirectly affected by the imprinting status of H19 and/or IGF2. Thus, the expression of IGF2-related genes can be stimulated by a loss of imprinting of, for example, the IGF2 gene. In general, methods of the invention include analyzing the biological sample for a change in the expression of a target gene that is directly or indirectly associated with loss of imprinting, or a polymorphism thereof. Loss of imprinting can result from, for example, a change in the methylation status of the gene. The change in methylation status can be hypomethylation of, for example, a DMR of the H19 gene and/or a DMR of the IGF2 gene. Subsequently, a reference ratio can be generated from tissue obtained from a subject that includes cells displaying normal imprinting of at least one of the H19 gene and the IGF2 gene.

Methods provided herein may include analyzing the biological sample for a change in methylation of a target gene, or a polymorphism thereof. The change in methylation can be hypomethylation of, for example, a DMR of the H19 gene and a DMR of the IGF2 gene. However, it is understood that any change in DNA methylation, histone modification such as, but not limited to, acetylation, methylation, phosphorylation, or any change in allele-specific gene expression can result in the over or under expression of a target gene, thereby affecting the differentiation status of a cell or group of cells in a tissue. In addition, any change in the expression of genes that are indicators of progenitor cell fraction, such as musashi and twist, is also encompassed by methods provided herein.

Methods provided herein may include analyzing genomic DNA from a sample and detecting altered expression of a target gene resulting directly or indirectly from loss of imprinting (LOI) of, for example, the IGF2 or the H19 gene. It is understood that LOI can directly or indirectly affect the expression of a target gene. For example, the expression of IGF2-related genes, such as Igf1R, IRS-1, IRS-2, PI3K, Akt, p70S6 kinase, FOXO, GSK3, MDM2, mTOR, Cyclin D1, c-Myc, Shc, Grb2, SOS, Ras, Raf, MEK, Erk, and MAPK, is affected by the imprinting status of IGF2. Exemplary methods of detecting DNA methylation include Southern blotting, bisulfite sequencing, methylation-specific PCR (MSP), real-time MSP, in situ MSP, immunofluorescent staining, and HPLC. Exemplary methods of detecting histone modification include ChIP analysis. Exemplary methods for detecting mRNA include real-time RT-PCR, northern blotting and in situ hybridization. Exemplary methods for detecting protein include immunohistochemical staining, immunofluorescent staining and western blotting.
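As a rough illustration of how a bisulfite sequencing readout might be reduced to a methylation level at a single CpG site, the following sketch counts base calls across reads. It assumes the usual bisulfite chemistry, in which an unmethylated cytosine is read as T and a methylated cytosine is read as C; the function name and the input encoding (one base call per read, as a string) are hypothetical.

```python
def cpg_methylation_fraction(base_calls: str) -> float:
    """Fraction of reads supporting methylation at one CpG cytosine.

    base_calls holds one character per read at the cytosine position:
    'C' = methylated (protected from conversion), 'T' = unmethylated.
    Any other character (e.g. 'N') is treated as uninformative and skipped.
    """
    informative = [b for b in base_calls.upper() if b in "CT"]
    if not informative:
        raise ValueError("no informative reads at this position")
    return informative.count("C") / len(informative)


# Ten reads covering one site, nine of which retain the cytosine.
fraction = cpg_methylation_fraction("CCCCCCCCCT")
```

A per-site fraction of this kind could then be averaged over the sites of a region of interest; how sites are selected and aggregated is left open here.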

Methods provided herein may further include generating a ratio or “subject profile” from tissue obtained from a subject that includes cells displaying normal expression of a target gene such as, for example, the H19 gene and/or the IGF2 gene. A “subject profile,” as used herein, simply means identifying the ratio of undifferentiated and differentiated cells in a given sample from a test subject. A ratio can be generated from a sample taken from, for example, intestinal tissue. The subject profile can be expressed as an array “signature” or “pattern” of specific identifiable biomarkers that distinguish undifferentiated cells from differentiated cells. The array signature can be color-coded for easy visual or computer-aided identification. The signature can also be described as a number(s) that correspond to values attributed to the biomarkers identified by the array. “Array analysis,” as used herein, is the process of extrapolating information from an array using statistical calculations such as factor analysis or principal component analysis (PCA). In addition to being expressed as a signature, a reference ratio can be expressed as a “threshold” value or series of threshold values. For example, a single threshold value can be determined for the level of a particular biomarker, or series of biomarkers, in a particular sample. A threshold value can have a single value or a plurality of values, each value representing a level of a specific biomarker, or specific series of biomarkers that are indicative of the presence of differentiated or undifferentiated cells.
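The use of a series of threshold values, one per biomarker, can be sketched as follows. This is a minimal illustration under stated assumptions: the numeric levels and thresholds are invented, the function name is hypothetical, and only the simple case of a progenitor-associated marker exceeding its threshold is handled.

```python
def markers_above_threshold(subject_profile: dict[str, float],
                            thresholds: dict[str, float]) -> list[str]:
    """Return the biomarkers whose measured level exceeds its threshold.

    Biomarkers without a defined threshold are ignored. Levels are in
    arbitrary units (for example, fraction of positively stained cells).
    """
    return sorted(
        name
        for name, level in subject_profile.items()
        if name in thresholds and level > thresholds[name]
    )


# Hypothetical values for two progenitor-associated markers and one
# marker for which no threshold has been set.
profile = {"Musashi1": 0.35, "Twist": 0.10, "Ki-67": 0.50}
limits = {"Musashi1": 0.20, "Twist": 0.15}
elevated = markers_above_threshold(profile, limits)
```

A differentiation-associated marker would instead be flagged when it falls below its threshold; that direction is omitted to keep the sketch short.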

The ratio constituting the subject profile can be compared to a “reference ratio” or “reference profile.” In general, reference profiles are generated from a series of different subjects and tissues. The reference profile is used as a baseline for determining whether the ratio provided in the subject profile is normal or abnormal for the subject and/or type of tissue being tested. “Subject profiles” and “reference profiles” are discussed below.
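The comparison of a subject ratio against a reference profile can be illustrated with a minimal Python sketch. It assumes that the reference profile is simply a list of ratios from reference subjects and that "abnormal" means lying more than a chosen number of standard deviations from the reference mean. The function names, the z-score criterion, and the cutoff of 2.0 are illustrative assumptions and are not taken from the specification.

```python
from statistics import mean, stdev


def undiff_to_diff_ratio(undiff_count, diff_count):
    """Ratio of undifferentiated to differentiated cells in one sample."""
    if undiff_count < 0 or diff_count <= 0:
        raise ValueError("counts must be non-negative, with at least one differentiated cell")
    return undiff_count / diff_count


def compare_to_reference(subject_ratio, reference_ratios, z_cutoff=2.0):
    """Compare a subject ratio with ratios from reference subjects.

    The z-score rule and z_cutoff are hypothetical choices for illustration.
    """
    if len(reference_ratios) < 2:
        raise ValueError("need at least two reference ratios")
    ref_mean = mean(reference_ratios)
    ref_sd = stdev(reference_ratios)
    if ref_sd == 0:
        raise ValueError("reference ratios show no variation")
    z = (subject_ratio - ref_mean) / ref_sd
    return {"z_score": z, "abnormal": abs(z) > z_cutoff}


# Hypothetical reference ratios from five reference subjects (same tissue type).
reference = [0.20, 0.22, 0.18, 0.21, 0.19]
subject = undiff_to_diff_ratio(undiff_count=30, diff_count=100)
result = compare_to_reference(subject, reference)
print(result["abnormal"])
```

In practice the reference profile would be built per tissue type, since the text notes that what counts as normal depends on the subject and the tissue being tested.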

The biological sample used to generate a reference ratio can be the same as or different from the sample used to identify abnormal imprinting of a target gene. As used herein, biological sample includes any tissue sample, such as intestinal tissue, blood, or serum. It is understood that the subject from which the sample is obtained need not have a cell proliferation or neoplastic disorder, such as colon cancer or pancreatic cancer, in order for the methods of the invention to be useful. In fact, the invention contemplates the use of normal (i.e., non-neoplastic) tissue in order to identify a subject predisposed to developing a cell proliferation or neoplastic disorder. The biological sample can include epithelial cells obtained from, for example, a rectal Pap test. The biological sample can include cells obtained from intestinal tissue, such as the colon or pancreas. More specifically, the cells can be obtained from the lumen of the intestinal tissue. Such cells can be, for example, epithelial cells obtained from the crypts of the lumen of the intestinal tissue. A cell proliferation or neoplastic disorder can be associated with a solid tumor, such as an adenoma. The results of a screen for a predisposition to developing a cell proliferation or neoplastic disorder can be correlated with the subject's family genetic history. Subsequently, the subject can undergo additional diagnostic tests including chest X-rays, colorectal examinations, endoscopic examination, MRI, CAT scanning, gallium scanning, and barium imaging.

In other embodiments, cell differentiation can be determined by more conventional means, such as microscopy and immunohistochemical identification. For example, a sample can be imaged using immunohistochemical identification of biomarkers specifically associated with a differentiated or undifferentiated cell population. In addition, standard microscopy can be used to distinguish differentiated from undifferentiated cells using morphologic measurements. Further, immunohistochemical identification of proliferation antigens and their distribution within, for example, colonic crypts, can be used to distinguish differentiated from undifferentiated cells. Finally, the sample can be imaged using immunofluorescent identification of molecules specific to a biomarker associated with a differentiated or undifferentiated cell population.

In addition, “normal” (i.e., non-cancerous) tissue obtained from asubject can be examined for other characteristics indicative of apredisposition for developing a cell proliferation or neoplasticdisorder such as cancer. Such characteristics can include changes in theexpression of genes or expression of proteins that are associated withspecific niches (and size of the niche) or compartment of a particulartissue. Also included are changes in distribution of cells within nichesor compartments from normal tissue. Also included are changes in thedistribution and number of progenitor cells in such tissues. Alsoincluded are increases in the number of stem cells and/or precursorcells for cancer in the tissue. Further, an increase in the number ofcells showing cancer-like features can be used as an indicator ofincreased risk of developing cancer. Similarly, an alteration in thematuration of the otherwise normal cells can be indicative of a cellproliferation or neoplastic disorder.

A “biomarker” can be a molecule that distinguishes differentiated fromundifferentiated cells. Such biomarkers include, but are not limited to,Shh (Sonic hedgehog), Tcf4, Lef1, Twist, EphB2, EphB3, Hes1, Notch1,Hoxa9, Dkk1, Tle6, Tcf3, Bmi1, Kit, Musashi1 (Msi1), Cdx1, Hes5, Oct4,Ki-67, β-catenin, Noggin, BMP4, PTEN (phosphorylated PTEN), and Akt(phosphorylated Akt) for identifying undifferentiated cells. Biomarkersuseful for identifying differentiated cells include, but are not limitedto, Villin, Aminopeptidase N (anpep), Sucrase isomaltase (SI), Ephrin-B1(EfnB1), Cdx2, Crip, Apoa1, Aldh1b1, Calb3, Dgat1, Dgat2, Clu,Hephaestin, Gas1, Ihh (Indian hedgehog), intrinsic factor B12 receptor,IFABP, and KLF4. A biomarker can further encompass oligosaccharides,polysaccharides, oligopeptides, proteins, oligonucleotides, andpolynucleotides. Oligonucleotides and polynucleotides include, forexample, DNA and RNA, e.g., in the form of aptamers. A biomarker canalso include organic compounds, organometallic compounds, salts oforganic and organometallic compounds, saccharides, amino acids, andnucleotides, lipids, carbohydrates, drugs, steroids, lectins, vitamins,minerals, metabolites, cofactors, and coenzymes.

Various antigens are associated with undifferentiated and differentiated cells. The term “associated” here means the cells expressing or capable of expressing, or presenting or capable of being induced to present, or comprising, the respective antigen(s). Each specific antigen associated with an undifferentiated cell or a differentiated cell can act as a biomarker. Hence, different types of cells can be distinguished from each other on the basis of their associated particular antigen(s) or on the basis of a particular combination of associated antigens.

The methods provided herein may utilize, in part, various means fordistinguishing less differentiated cells from those that have undergonedifferentiation. Cell differentiation is a process whereby structuresand functions of cells are progressively committed to give rise to morespecialized cells. Therefore, as the cells become more committed, theybecome more specialized. In the majority of mammalian cell types, celldifferentiation is a one-way process leading ultimately to terminallydifferentiated cells. However, although some cell types persistthroughout life without dividing and without being replaced, many celltypes do continue to divide during the lifetime of the organism andundergo renewal. This may be by simple division (e.g., liver cells) or,as in the case of cells such as haemopoietic cells and epidermal cells,by division of relatively undifferentiated stem cells followed bycommitment of one of the daughter cells to a program of subsequentirreversible differentiation. All of these processes, however, have onefeature in common: cells either maintain their state of differentiationor become more differentiated. They do not become undifferentiated oreven less differentiated.

The methods provided herein can also encompass identification of thosecells that may have undergone “dedifferentiation.” Dedifferentiation isa process whereby structures and functions of cells are progressivelychanged to give rise to less specialized cells. Some cells naturallyundergo limited reverse differentiation (dedifferentiation) in vivo inresponse to tissue damage. For example, liver cells have been observedto revert to an enzyme expression pattern similar to the fetal enzymepattern during liver regeneration (Curtin and Snell, 1983, Br. J.Cancer, Vol 48; 495-505). While preserving the entire informationencoded on its genome, cells undergoing retrodifferentiation losemorphological and functional complexity by virtue of a process ofself-deletion of cytoplasmic structures and the transition to a morejuvenile pattern of gene expression. This results in a progressiveuniformization of originally distinct cell phenotypes and to a decreaseof responsiveness to regulatory signals operational in adult cells.

In another embodiment, methods of determining whether a subject ispredisposed to developing a cell proliferation or neoplastic disordermay include identifying a subject comprising cells displaying increasedlevels of, for example, IGF2 gene expression. Subsequently or inparallel, the ratio of undifferentiated to differentiated cells in thesame or different sample from the subject can be determined. Thedetermination of increased levels of, for example, IGF2 gene expressioncan include detection of increased levels of IGF2 mRNA and/or IGF2polypeptide. Methods of detecting mRNA and/or polypeptides in a sampleare well known to those skilled in the art of molecular biology. It isunderstood that increased levels of the target gene expression includesincreased levels of target gene mRNA and/or increased levels of apolypeptide encoded by the target gene, such as H19, IGF2, Igf1R, IRS-1,IRS-2, PI3K, Akt, p70S6 kinase, FOXO, GSK3, MDM2, mTOR, Cyclin D1,c-Myc, Shc, Grb2, SOS, Ras, Raf, MEK, Erk, or MAPK gene.

In another embodiment, a method of determining whether a subject is predisposed to developing a cell proliferation or neoplastic disorder is provided. The method may include contacting a normal biological sample from a subject with an array of immobilized biomolecules that specifically interact with a biomarker indicative of differentiated or undifferentiated cells. The method may further include obtaining a subject profile by detecting a modification of the biomolecules. The modification of a biomolecule may be indicative of the ratio of differentiated to undifferentiated cells in the sample. “Biomolecules,” as used herein, include proteins, such as monoclonal or polyclonal antibodies. Biomolecules also include antigens or receptors. Modification, as used herein, may include binding Shh, Tcf4, Lef1, Twist, EphB2, EphB3, Hes1, Notch1, Hoxa9, Dkk1, Tle6, Tcf3, Bmi1, Kit, Musashi1 (Msi1), Cdx1, Hes5, Oct4, Ki-67, β-catenin, Noggin, BMP4, PTEN (phosphorylated PTEN), Akt (phosphorylated Akt), Villin, Aminopeptidase N (anpep), Sucrase isomaltase (SI), Ephrin-B1 (EfnB1), Cdx2, Crip, Apoa1, Aldh1b1, Calb3, Dgat1, Dgat2, Clu, Hephaestin, Gas1, Ihh (Indian hedgehog), intrinsic factor B12 receptor, IFABP, or KLF4 to a biomolecule.

Subsequently, the subject profile may be compared with a referenceprofile that comprises one or more values. Each value can represent thelevel of biomarker in a reference sample obtained from one or morereference subjects that are not predisposed to developing a cellproliferation or neoplastic disorder. The method may further includeidentifying abnormal expression of at least one target gene in the sameor different biological sample obtained from the subject. Exemplarytarget genes include H19, IGF2, Igf1R, IRS-1, IRS-2, PI3K, Akt, p70S6kinase, FOXO, GSK3, MDM2, mTOR, Cyclin D1, c-Myc, Shc, Grb2, SOS, Ras,Raf, MEK, Erk, or MAPK gene. The abnormal expression of the target genemay be directly or indirectly related to a loss of imprinting.

The presence or absence of LOI may be detected by examining any condition, state, or phenomenon which causes LOI or is the result of LOI. Such conditions, states, and phenomena include, but are not limited to: 1) causes of LOI, such as the state or condition of the cellular machinery for DNA methylation, the state of the imprinting control region on chromosome 11, the presence of trans-acting modifiers of imprinting, and the degree or presence of histone deacetylation; 2) the state of the genomic DNA associated with the genes or gene for which LOI is being assessed, such as the degree of DNA methylation; and 3) effects of LOI, such as: a) relative transcription of the two alleles of the genes or gene for which LOI is being assessed; b) post-transcriptional effects associated with the differential expression of the two alleles of the genes or gene for which LOI is being assessed; c) relative translation of the two alleles of the genes or gene for which LOI is being assessed; d) post-translational effects associated with the differential expression of the two alleles of the genes or gene for which LOI is being assessed; e) other downstream effects of LOI, such as altered gene expression measured at the RNA level, at the splicing level, or at the protein level or post-translational level (i.e., measure one or more of these properties of an imprinted gene's manifestation into various macromolecules); changes in function that could involve, for example, cell cycle, signal transduction, ion channels, membrane potential, cell division, or others (i.e., measure the biological consequences of a specific imprinted gene being normally or not normally imprinted, for example, QT interval of the heart). Another group of macromolecular changes could be in associated processes such as histone acetylation, histone deacetylation, or RNA splicing.

When detecting the presence or absence of LOI by relying on any one of these conditions, states, or phenomena, it is possible to use a number of different specific analytical techniques. In particular, it is possible to use any of the methods for determining the pattern of imprinting known in the art. It is recognized that the methods may vary depending on the gene to be analyzed.

Conditions, states, and phenomena which may cause LOI and may be examined to assess the presence or absence of LOI include: the state or condition of the cellular machinery for DNA methylation; the state of the imprinting control region on chromosome 11; the presence of trans-acting modifiers of imprinting; the degree or presence of histone acetylation or histone deacetylation; the imprinting control center; trans-acting modulatory factors; and changes in chromatin caused by polycomb-like proteins, trithorax-like proteins, or human homologues of other chromatin-affecting proteins in other species, such as Su(var) proteins in Drosophila, SIR proteins in yeast, mating type silencing in yeast, and XIST-like genes in mammals.

It is also possible to detect LOI by examining the DNA associated withthe gene or genes for which the presence or absence of LOI is beingassessed. By the term “the DNA associated with the gene or genes forwhich the presence or absence of LOI is being assessed” it is meant thegene, the DNA near the gene, or the DNA at some distance from the gene(as much as a megabase or more away, i.e., methylation changes can befar away, since they act on chromatin over long distances). Suchapproaches include measuring the degree of methylation in the DNAassociated with the gene or genes for which the presence or absence ofLOI is being assessed. It is also possible to detect LOI by examiningmodifications to DNA-associated protein, such as histone acetylation andhistone deacetylation; changes to binding proteins detected by bandshift, protection assays, or other assays, in addition to changes to theDNA sequence itself.

The degree of methylation in the DNA associated with the gene or genes for which the presence or absence of LOI is being assessed may be measured by means of a number of analytical techniques. For example, the DNA associated with the gene or genes for which the presence or absence of LOI is being assessed may be sequenced using conventional DNA sequencing techniques as described in “Current Protocols in Molecular Biology” (Ausubel et al., Wiley Interscience, 1998). In this case, the biological sample will be any which contains sufficient DNA to permit sequencing.

In addition, the degree of methylation in the DNA, associated with thegene or genes for which the presence or absence of LOI is beingassessed, may be measured by fluorescent in situ hybridization (FISH) bymeans of probes which identify and differentiate between genomic DNAs,associated with the gene for which the presence or absence of LOI isbeing assessed, which exhibit different degrees of DNA methylation. Inthis case, the biological sample will typically be any which containssufficient whole cells or nuclei to perform short term culture. Usually,the sample will be a tissue sample which contains 10 to 10,000,preferably 100 to 10,000, whole somatic cells.

Typically, in methods for assaying allele-specific gene expression which rely upon the differential transcription of the two alleles, RNA is reverse transcribed with reverse transcriptase, and then PCR is performed with PCR primers that span a site within an exon where that site is polymorphic (i.e., normally variable in the population), and this analysis is performed on an individual that is heterozygous (i.e., informative) for the polymorphism. One then uses any of a number of detection schemes to determine whether one or both alleles are expressed. See also Rainier et al. (1993) Nature 362:747-749, which teaches the assessment of allele-specific expression of IGF2 and H19 by reverse transcribing RNA and amplifying cDNA by PCR using new primers that permit a single round rather than nested PCR; Matsuoka et al. (1996) Proc. Natl. Acad. Sci. USA 93:3026-3030, which teaches the identification of a transcribed polymorphism in p57KIP2; Thompson et al. (1996) Cancer Research 56:5723-5727, which teaches determination of mRNA levels by RPA and RT-PCR analysis of allele-specific expression of p57KIP2; and Lee et al. (1997) Nature Genet. 15:181-185, which teaches RT-PCR SSCP analysis of two polymorphic sites. Such disclosures are herein incorporated by reference. In this case, the biological sample will be any which contains sufficient RNA to permit reverse transcription followed by amplification by polymerase chain reaction. Typically, the biological sample will be a tissue sample which contains 1 to 10,000,000, preferably 1000 to 10,000,000, more preferably 1,000,000 to 10,000,000, somatic cells.
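The allele-specific readout described above can be illustrated with a minimal Python sketch. It assumes that a detection scheme has already produced one quantified signal per allele (for example, a band intensity) from a subject heterozygous for the transcribed polymorphism. The function names and the cutoff of one third for calling biallelic expression are hypothetical and are not specified in this disclosure.

```python
def allelic_ratio(allele_a_signal, allele_b_signal):
    """Return the less-expressed allele's signal divided by the more-expressed one.

    A value near 0 suggests monoallelic expression; a value near 1 suggests
    that both alleles are expressed about equally.
    """
    if allele_a_signal < 0 or allele_b_signal < 0:
        raise ValueError("signals must be non-negative")
    major = max(allele_a_signal, allele_b_signal)
    minor = min(allele_a_signal, allele_b_signal)
    if major == 0:
        raise ValueError("no expression detected for either allele")
    return minor / major


def classify_imprinting(allele_a_signal, allele_b_signal, loi_cutoff=1 / 3):
    """Label a heterozygous (informative) sample.

    loi_cutoff is an assumed, illustrative threshold, not a value from the text.
    """
    ratio = allelic_ratio(allele_a_signal, allele_b_signal)
    return "LOI" if ratio >= loi_cutoff else "imprinted"


# Hypothetical band intensities for the two alleles of one subject.
print(classify_imprinting(100.0, 5.0))
print(classify_imprinting(100.0, 60.0))
```

The sketch applies only to informative subjects; a homozygous subject gives no way to tell the two transcripts apart by this approach.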

It is also possible to utilize allele-specific RNA-associated in situ hybridization (ASISH) to detect the presence or absence of LOI by relying upon the differential transcription of the two alleles. In ASISH, the relative abundance of transcribed mRNA for two alleles is assessed by means of probes which identify and differentiate between the mRNA transcribed from the two alleles. Typically, the probes are tagged with fluorescent labels, which provide high sensitivity and easily quantifiable results. ASISH is described in Adam et al. (1996) “Allele-specific in situ hybridization (ASISH) analysis: a novel technique which resolves differential allelic usage of H19 within the same cell lineage during human placental development,” Development 122:839-847, which is incorporated herein by reference. In this case, the biological sample will typically be any which contains sufficient whole cells or nuclei to perform histological section and in situ hybridization. Usually, the sample will be a tissue sample which contains 10-100,000, preferably 100-1000, whole somatic cells.

Accordingly, it is also possible to detect LOI by examining allele-specific post-transcriptional effects (i.e., effects after transcription and before translation), like alternate splicing that depends on which allele was transcribed, and detection of secondary structure of the RNA.

It is also possible to detect LOI by examining the relative translationof the two alleles of the gene or genes for which the presence orabsence of LOI is being measured. In this case, the presence or relativeabundance of the two polypeptides arising from the expression of the twoalleles is measured directly. This approach can be effected by any knowntechnique for detecting or quantifying the presence of a polypeptide ina biological sample. For example, allele-specific translational effectsmay be examined by quantifying the proteins expressed by the two allelesusing antibodies specific for each allele (transcribed, translatedpolymorphism). Such effects may be measured and/or detected by suchanalytical techniques as Western blotting, or use of an ELISA assay. Inthis case, the biological sample will be any which contains a sufficientamount of the polypeptide(s) encoded by the gene(s) for which thepresence or absence of LOI is being measured.

LOI may also be detected by examining post-translational effects, suchas secondary modifications that are specific to one allele, likeglycosylation or phosphorylation. For example, one allele may bemodified, say by phosphorylation or glycosylation, and the other onenot. Because the polymorphism encodes a recognition motif, then one canreadily distinguish the difference by a Western blot, detectingalternate migration of the polypeptide or protein; use of antibodiesspecific for the modified form; radioactive incorporation of phosphorylgroup or glycosyl group or other modification (i.e., in living cells,followed by the detection of a band at a varying location).

LOI may also be detected by reliance on other allele-specific downstream effects. For example, depending on the metabolic pathway in which the product of the imprinted gene lies, the difference will be 2× versus 1× (or some number in between) of the product, and therefore of the function, or a variation in function specific to one of the alleles. For example, for IGF2: increased mitogenic signaling at the IGF1 receptor, increased occupancy of the IGF1 receptor, increased activity at the IGF2 catabolic receptor, or decreased apoptosis due to the dose of IGF2; for KvLQT1: change in the length of the QT interval depending on the amount and isoform of protein, change in electrical potential, or change in activity when the RNA is extracted and introduced into Xenopus oocytes.

It is also possible to detect LOI by detecting an associated haplotype, i.e., linked polymorphisms that identify people whose genes are prone to LOI. Thus, LOI may be detected by relying on a polymorphism, i.e., a genetic difference between the two alleles. However, it will be recognized that many of the techniques described above may be used to detect LOI even when there is no polymorphism in the two alleles of the gene or genes for which the presence or absence of LOI is being measured. For example, LOI may be detected by reliance on allele-specific DNA methylation (polymorphism independent); histone acetylation; other modifications to DNA; or alterations in replication timing, when the imprinted allele shows “replication timing asynchrony”, i.e., the two alleles replicate at different times. When the two alleles replicate at the same time, LOI may be detected by FISH. Since imprinted alleles pair in the late S phase, LOI may be detected by the absence of such pairing in the late S phase as observed by FISH.

On the other hand certain techniques are more conveniently used whenthere is a polymorphism in the two alleles of the gene or genes forwhich the presence or absence of LOI is being measured. For example,RT-PCR followed by SSCP (single strand conformational polymorphism)analysis; restriction enzyme digestion analysis followed byelectrophoresis or Southern hybridization; or radioisotopic PCR; PCR;allele-specific oligonucleotide hybridization; direct sequencingmanually or with an automated sequencer; denaturing gradient gelelectrophoresis (DGGE); and many other analytical techniques can be usedto detect LOI when relying on a polymorphism.

The presence or absence of LOI may be determined for any gene or genes which are known to normally exhibit imprinting. Currently there are about 22 genes which are known to be normally imprinted (see Feinberg in The Genetic Basis of Human Cancer, B Vogelstein & K Kinzler, Eds., McGraw Hill, 1997, which is incorporated herein by reference). Examples of such genes include, but are not limited to, IGF2, H19, p57KIP2, KvLQT1, TSSC3, TSSC5, and ASCL2. However, it is expected that additional genes which normally exhibit imprinting will be discovered in the future, and the LOI of such genes may be the target of the present methods and is therefore included in the present invention.

Direct approaches to identifying novel imprinted genes include, but are not limited to, positional cloning efforts aimed at identifying imprinted genes near other known imprinted genes (Barlow et al. (1991) Nature 349:84-87); techniques comparing gene expression in parthenogenetic embryos to that of normal embryos (Kuroiwa et al. (1996) Nat. Genet. 12:186-190); and restriction landmark genome scanning (Nagai et al. (1995) Biochem. Biophys. Res. Commun. 213:258-265).

The methods described herein encompass the identification of subjectspredisposed to developing a cell proliferation or neoplastic disorder bydetermining the ratio of non-cancerous undifferentiated cells to that ofnon-cancerous differentiated cells in a tissue sample obtained from thesubject. It should be understood that the present methods of assessingthe risk of contracting cancer may include comparing the ratio describedabove against one or more predetermined threshold values, such that, ifthe ratio is below a given threshold value then the subject is assignedto a low risk population for developing a cell proliferation orneoplastic disorder. Alternatively, the analytical technique may bedesigned not to yield an explicit numerical value for the ratio ofnon-cancerous undifferentiated cells to that of non-cancerousdifferentiated cells, but instead yield only a first type of signal whenthe ratio is below a threshold value and/or a second type of signal whenthe ratio is above a threshold value. It is also possible to carry outthe present methods by means of a test in which the ratio is signaled bymeans of a non-numeric spectrum such as a range of colors encounteredwith immunohistochemical analysis of a tissue sample. The presentmethods may optionally include detecting LOI in the tissue.
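The threshold logic described above can be illustrated with a minimal Python sketch covering both variants: a two-state signal around a single threshold, and assignment to a band when a series of thresholds is used. The enum, the function names, and the numeric thresholds are hypothetical and are not taken from the specification.

```python
from bisect import bisect_right
from enum import Enum


class RiskSignal(Enum):
    LOW = "first signal: ratio below threshold"
    ELEVATED = "second signal: ratio at or above threshold"


def risk_signal(ratio, threshold):
    """Two-state readout that does not report the numerical ratio itself."""
    return RiskSignal.LOW if ratio < threshold else RiskSignal.ELEVATED


def risk_band(ratio, thresholds):
    """Index of the band a ratio falls into, given one or more thresholds.

    Band 0 lies below the lowest threshold; a ratio equal to a threshold is
    placed in the band above it, consistent with risk_signal.
    """
    return bisect_right(sorted(thresholds), ratio)


# Hypothetical thresholds for the undifferentiated-to-differentiated ratio.
print(risk_signal(0.18, threshold=0.25).name)
print(risk_band(0.40, thresholds=[0.25, 0.50]))
```

A banded readout of this kind is one way to represent a non-numeric spectrum, such as a range of staining colors, as ordered categories.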

The present methods of assessing the risk of developing a cell proliferative disorder may suitably be carried out on any subject selected from the population as a whole. However, it may be preferred to carry out this method on certain selected groups of the general population when screening for the predisposition to particular types of cancer. Preferably, the present method is used to screen selected groups which are already known to have an increased risk of contracting the particular type of cancer in question.

The methods described herein encompass the identification of subjects predisposed to developing a cell proliferation or neoplastic disorder by determining the ratio of non-cancerous undifferentiated cells to that of non-cancerous differentiated cells in a tissue sample obtained from the subject. These methods optionally include detecting LOI in the tissue by determining, for example, the degree of methylation of the genomic DNA associated with particular target gene(s) for which LOI is being detected.

Exemplary epigenetic alterations in human cancers include global DNAhypomethylation, gene hypomethylation and promoter hypermethylation, andloss of imprinting (LOI) of the insulin-like growth factor-II gene(IGF2). One mechanism for LOI is hypermethylation of a differentiallymethylated region (DMR) upstream of, for example, the H19 gene, allowingactivation of the normally silent maternal allele of IGF2. Anothermechanism for LOI includes hypomethylation of the H19 DMR as well as theDMR upstream of exon 3 of IGF2 in colorectal cancers. Thishypomethylation has been identified in both colorectal cancers andnormal mucosa from the same patients, and in cell lines with somaticcell knockout of DNA methyltransferases DNMT1 and DNMT3B. Thus,hypermethylation and hypomethylation are mechanisms for LOI. Forexample, hypomethylation of both the IGF2 gene and the H19 gene can becorrelated with loss of imprinting of the IGF2 gene and LOI of IGF2 canbe correlated with the presence and increased risk for developingcancer, e.g., colorectal cancer.

Methods of the present invention may optionally include analyzing LOIof, for example, the IGF2 gene by analyzing hypomethylation of the IGF2gene or H19 gene, to identify an increased risk of developing cancer ina subject. This information may be correlated with celldifferentiation/undifferentiation data obtained from the same subject.The method may include analyzing a biological sample from the subjectfor hypomethylation of a differentially methylated region (DMR) of theH19 gene and/or the IGF2 gene, or a polymorphism and/or fragment of theH19 DMR and/or IGF2 DMR. The H19 DMR, or fragment thereof, may include aCTCF binding site, for example, CTCF binding site 1 or CTCF binding site6.

In certain aspects, the subject is an apparently normal subject. Hypomethylation can be analyzed in a DNA region corresponding to an H19 DMR. An IGF2 DMR sequence can correspond to GenBank nucleotides 631-859 (accession no. Y13633). One exemplary IGF2 DMR corresponds to position −566 bp to −311 bp relative to exon 3 of IGF2 (i.e., nucleotides 661 to 916 of GenBank accession no. Y13633). Another DMR of H19 corresponds to nucleotides 2057 to 8070 of GenBank accession no. AF087017, incorporated herein by reference in its entirety, which correspond in variant form to nucleotides 3829 to 9842 of AF125183. In certain aspects the method comprises analyzing the biological sample for hypomethylation of positions within the region of the H19 DMR that are analyzed using the nested primer pairs SEQ ID NOs:21 and 22, followed by SEQ ID NOs:23 and 24. Furthermore, in certain aspects, hypomethylation is analyzed in a DNA region corresponding to an IGF2 DMR. In certain aspects the method comprises analyzing the biological sample for hypomethylation of positions within the region of the IGF2 DMR that are analyzed using the nested primer pairs SEQ ID NOs:1 and 2, followed by SEQ ID NOs:3 and 4, or the region analyzed using primer pairs SEQ ID NOs:29 and 30, followed by SEQ ID NOs:27 and 28.

Thus, in addition to including devices and reagents for distinguishing differentiated from undifferentiated cells, a kit for performing methods of the invention can further include a plurality of oligonucleotide probes, primers, or primer pairs, or combinations thereof, capable of binding to the DMR of IGF2 or H19 with or without prior bisulfite treatment of the DMR. The kit can include an oligonucleotide primer pair that hybridizes under stringent conditions to all or a portion of the DMR only after bisulfite treatment. The kit can include instructions on using kit components to identify an increased risk of developing cancer. In certain embodiments the instructions are directed at subjects of the general population. The kit, for example, includes one or both of a primer pair corresponding to the primer pair SEQ ID NO:21 and SEQ ID NO:22 and the primer pair SEQ ID NO:23 and SEQ ID NO:24. In another aspect, the kit, for example, includes one or both of a primer pair corresponding to the primer pair SEQ ID NO:25 and SEQ ID NO:26, and the primer pair SEQ ID NO:27 and SEQ ID NO:28.
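Why a primer can hybridize to a DMR only after bisulfite treatment can be illustrated with a minimal Python sketch. Bisulfite treatment converts unmethylated cytosine to uracil, which is read as thymine after amplification, while 5-methylcytosine is protected. The sketch models the top strand only and treats "anneals" as an exact substring match, which is a deliberate simplification; the toy template and primer are hypothetical and are not sequences from this disclosure.

```python
def bisulfite_convert(top_strand, methylated_positions=frozenset()):
    """In silico bisulfite conversion of one DNA strand.

    Every C becomes T unless its 0-based index is listed in
    methylated_positions (methylated cytosines are left unchanged).
    """
    converted = []
    for index, base in enumerate(top_strand.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def primer_matches(primer, template):
    """Simplified annealing check: exact match of the primer within the template."""
    return primer.upper() in template.upper()


# Hypothetical toy template with one CpG (the C at index 2) and a primer
# designed against the fully converted, unmethylated sequence.
template = "GACGTTCCA"
primer = "GATGTT"

print(primer_matches(primer, template))                     # native DNA
print(primer_matches(primer, bisulfite_convert(template)))  # after conversion
```

Leaving index 2 in `methylated_positions` keeps that cytosine intact, so the same primer no longer matches; this is the principle that lets converted sequence report methylation state.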

Hypomethylation of a DMR is present when there is a measurable decreasein methylation of the DMR. Methods for determining methylation statesare provided herein. For example, the H19 DMR can be determined to behypomethylated when it is methylated at less than 10, less than 5, orless than 3 sites of all of the greater than 25 methylation sites withinthe H19 DMR. Alternatively, as illustrated in the Examples providedherein, hypomethylation of the H19 DMR can be identified when less than50% or less than 75% of the methylation sites analyzed are notmethylated. Methylation state can be analyzed for these DMRs byanalyzing less than all of the methylation sites within the DMR. Incertain aspects, the methylation sites are those sites for IGF2 that arelocated within the fragments amplified by the nested primer pairs SEQ IDNO:1 and SEQ ID NO:2 followed by SEQ ID NO:3 and SEQ ID NO:4, or SEQ IDNO:25 and SEQ ID NO:26 followed by SEQ ID NO:27 and SEQ ID NO:28. ForH19, in certain aspects methylation sites of fragments of the presentinvention are those found within nested primer pairs SEQ ID NO:21 andSEQ ID NO:22 followed by SEQ ID NO:23 and SEQ ID NO:24.
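The count-based and percentage-based criteria above can be illustrated with a minimal Python sketch. It assumes that each analyzed methylation site has already been scored as methylated or unmethylated, and it reads the percentage criterion as "fewer than the stated fraction of analyzed sites are methylated", which is an interpretation rather than a statement from the specification. The function names and default cutoffs are illustrative.

```python
def methylated_fraction(site_calls):
    """Fraction of analyzed sites scored as methylated (True)."""
    if not site_calls:
        raise ValueError("no methylation sites were analyzed")
    return sum(1 for call in site_calls if call) / len(site_calls)


def hypomethylated_by_count(site_calls, max_sites=10):
    """Count criterion: fewer than max_sites sites are methylated."""
    return sum(1 for call in site_calls if call) < max_sites


def hypomethylated_by_fraction(site_calls, max_fraction=0.5):
    """Fraction criterion (assumed reading): methylated fraction below max_fraction."""
    return methylated_fraction(site_calls) < max_fraction


# Hypothetical calls for 26 sites in a DMR: 4 methylated, 22 unmethylated.
calls = [True] * 4 + [False] * 22
print(hypomethylated_by_count(calls, max_sites=10))
print(hypomethylated_by_fraction(calls, max_fraction=0.5))
```

Because the text allows analyzing fewer than all sites in the DMR, the fraction criterion is computed over the sites actually scored rather than over the full set.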

A fragment of the H19 DMR or IGF2 DMR can be the region of the H19 DMRor IGF2 DMR that is amplified and/or flanked by primers that correspondto SEQ ID NOS:1-4 and 5-32. For example, the fragment of the H19 DMR canbe the region of the H19 DMR that is amplified by the primer pairrecited in SEQ ID NOS:21 and 22, or the primer pair recited in SEQ IDNOS:23 and 24, or by the nesting of SEQ ID NOS:21 and 22 followed by SEQID NOS:23 and 24. As another example, the fragment of the IGF2 DMR canbe the region of the IGF2 DMR that is amplified by the primer pairrecited in SEQ ID NOS: 25 and 26, or the primer pair recited in SEQ IDNOS:27 and 28, or by the nesting of SEQ ID NOS:25 and 26 followed by SEQID NOS:27 and 28. As another example, the fragment of the IGF2 DMR canbe the region of the IGF2 DMR that is amplified by the primer pairrecited in SEQ ID NOS: 1 and 2, or the primer pair recited in SEQ IDNOs:3 and 4, or by the nesting of SEQ ID NOS:1 and 2 followed by SEQ IDNOs:3 and 4. The sequences of the exemplary primers are listed below:

(SEQ ID NO: 1) 5′ GGTGAGGATGGGTTTTTGTT 3′
(SEQ ID NO: 2) 5′ CTACTCTCCCAACCTCCCTAA 3′
(SEQ ID NO: 3) 5′ ATTGGGGGTGGAGGGTGTAT 3′
(SEQ ID NO: 4) 5′ TCTATTACACCCTAAACCCAA 3′
(SEQ ID NO: 5) 5′ ATCTTGCTGACCTCACCAAGG 3′
(SEQ ID NO: 6) 5′ CGATACGAAGACGTGGTGTGG 3′
(SEQ ID NO: 7) 5′ CCGACTAAGGACAGCCCCCAAA 3′
(SEQ ID NO: 8) 5′ TGGAAGTCTCTGCTCTCCTGTC 3′
(SEQ ID NO: 9) 5′ ACAGTGTTCCTGGAGTCTCGCT 3′
(SEQ ID NO: 10) 5′ CACTTCCGATTCCACAGCTACA 3′
(SEQ ID NO: 11) 5′ ACAGGGTCTCTGGCAGGCTCAA 3′
(SEQ ID NO: 12) 5′ ATGAGTGTCCTATTCCCAGATG 3′
(SEQ ID NO: 13) 5′ AACTGGGGTTCGCCCGTGGAA 3′
(SEQ ID NO: 14) 5′ CAAATTCACCTCTCCACGTGC 3′
(SEQ ID NO: 15) 5′ GATCCTGATGGGGTTAGGATGT 3′
(SEQ ID NO: 16) 5′ GGAATTTCCATGGCATGAAAAT 3′
(SEQ ID NO: 17) 5′ GGTCTGCCTTGGTCTCCTAACT 3′
(SEQ ID NO: 18) 5′ GGCCACTTTCCTGTCTGAAGAC 3′
(SEQ ID NO: 19) 5′ CAGTCTCCACTCCACTCCCAAC 3′
(SEQ ID NO: 20) 5′ GACCTCTCCCTCCCAGACCACT 3′
(SEQ ID NO: 21) 5′ GAGTTTGGGGGTTTTTGTATAGTAT 3′
(SEQ ID NO: 22) 5′ CTTAAATCCCAAACCATAACACTA 3′
(SEQ ID NO: 23) 5′ GTATATGGGTATTTTTTGGAGGT 3′
(SEQ ID NO: 24) 5′ CCATAACACTAAAACCCTCAA 3′
(SEQ ID NO: 25) 5′ GGGAATGTTTATTTATGTATGAAG 3′
(SEQ ID NO: 26) 5′ TAAAAACCTCCTCCACCTCC 3′
(SEQ ID NO: 27) 5′ TAATTTATTTAGGGTGGTGTT 3′
(SEQ ID NO: 28) 5′ TCCAAACACCCCCACCTTAA 3′
(SEQ ID NO: 29) 5′ GTATAGGTATTTTTGGAGGTTTTTTA 3′
(SEQ ID NO: 30) 5′ CCTAAAATAAATCAAACACATAACCC 3′
(SEQ ID NO: 31) 5′ GAGGTTTTTTATTTTAGTTTTGG 3′
(SEQ ID NO: 32) 5′ ACTATAATATATAAACCTACAC 3′

Embodiments of the present invention are based on the finding of an association between a change in the ratio of undifferentiated to differentiated cells in a sample and an increased risk of developing cancer. This change may optionally be correlated with a loss of imprinting (LOI) of the IGF2 gene and family history of colorectal cancer (CRC). Accordingly, the present invention relates to a method for identifying an increased risk of developing cancer in a subject. The method includes analyzing a biological sample from the subject for a change in the ratio of undifferentiated to differentiated cells in the sample. Such a change can be indicative of an increased risk of developing cancer. Certain embodiments of the invention may further include analyzing genomic DNA for altered methylation of the IGF2 gene or the H19 gene. The method, for example, includes analyzing genomic DNA from the sample for hypomethylation of the IGF2 gene or the H19 gene.

A method according to the present invention can be performed during routine clinical care, for example as part of a general regular checkup, on a subject having no apparent or suspected neoplasm such as cancer. Therefore, the present invention, in certain embodiments, provides a screening method for the general population. The methods of the present invention can be performed at a younger age than present cancer screening assays; for example, the method can be performed on a subject under 65, 55, 50, 40, 35, 30, 25, or 20 years of age.

If the biological sample of the subject in question is found to exhibit a change in the ratio of undifferentiated to differentiated cells in the same or different sample from the subject as compared to a reference ratio, then that subject is identified as having an increased probability of having cancer. In these embodiments, further diagnostic tests may be carried out to probe for the possibility of cancer being present in the subject. Examples of such further diagnostic tests include, but are not limited to, chest X-ray, carcinoembryonic antigen (CEA) or prostate specific antigen (PSA) level determination, colorectal examination, endoscopic examination, MRI, CAT scanning, or other imaging such as gallium scanning, and barium imaging. Furthermore, the method of the invention can be coincident with routine sigmoidoscopy/colonoscopy of the subject. The method could involve use of a very thin tube, or a digital exam to obtain a colorectal sample. Additional diagnostic tests for LOI of specific genes can be performed.

According to the present invention, the biological or tissue sample can be drawn from any tissue that is susceptible to cancer. For example, the tissue may be obtained by surgery, biopsy, swab, stool, or other collection method. The biological sample for methods of the present invention can be, for example, a sample from colorectal tissue, or in certain embodiments, can be a blood sample, or a fraction of a blood sample such as a peripheral blood lymphocyte (PBL) fraction. Methods for isolating PBLs from whole blood are well known in the art. An example of such a method is provided in the Example section herein. In addition, it is possible to use a blood sample and enrich the small amount of circulating cells from a tissue of interest, e.g., colon, breast, etc., using a method known in the art.

An exemplary method of screening a subject for a predisposition to colon cancer may include scraping an area associated with the large intestine with a spatula, similar to the techniques used to obtain cells for a Pap smear. The scraped cells may then be smeared onto a slide, fixed with compounds suitable for distinguishing undifferentiated cells from differentiated cells, and subsequently analyzed under a microscope. Image analysis technology has been developed which may fully automate this process.

Alternatively, the cells obtained from a scrape specimen may be introduced into a liquid based transport medium having properties suitable for maintaining the cells in solution while allowing for the detection of biomarkers associated with the undifferentiated/differentiated state of the cells. For example, the transport medium may include any combination of the following: 1) a fixative that helps retain cellular morphology and allows cells to retain the ability to be analyzed for biomarkers by molecular methods; 2) an isotonic osmolarity medium to maintain cellular volumetric integrity; 3) a mucolytic agent to disrupt mucus; 4) a blood lysing agent such as ammonium chloride or acetic acid; 5) a cellular preservative; 6) a cellular ion agent to break up groups of, for example, colon cells so that they can be analyzed individually; 7) an anticoagulant such as heparin sodium; and/or 8) a stain to allow for cellular detection.

Once prepared and contacted with compounds suitable for detecting biomarker(s) that facilitate distinguishing undifferentiated from differentiated cells, the specimen may be analyzed via flow cytometry technology. Flow cytometry is a process by which cells pass singly in a fluid stream. The exact methods of achieving this may vary. It may be achieved by suspending the cells in isotonic fluid medium and introducing it into a nozzle shaped chamber with a small exit diameter. The ratio of undifferentiated to differentiated cells can be compared against, for example, a threshold value as described above. Biomarkers which may be used to distinguish undifferentiated cells from differentiated cells are discussed elsewhere in this disclosure. This method can optionally be coupled with detecting LOI in the tissue obtained from the subject.
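The comparison of the cell ratio against a threshold can be sketched as follows. This is an illustration only: the cell counts, the reference ratio, and the 1.5-fold cutoff are hypothetical values chosen for the example, and the function names are not taken from the disclosure. In practice the threshold would be established from reference samples.

```python
def undiff_to_diff_ratio(n_undifferentiated, n_differentiated):
    """Ratio of undifferentiated to differentiated cells counted in a specimen."""
    if n_differentiated <= 0:
        raise ValueError("at least one differentiated cell is required")
    return n_undifferentiated / n_differentiated


def exceeds_reference(ratio, reference_ratio, fold_change=1.5):
    """Return True when the ratio is at least fold_change times the reference ratio.

    fold_change=1.5 is an assumed, illustrative cutoff.
    """
    return ratio >= reference_ratio * fold_change


# Hypothetical flow cytometry counts: cells positive for an
# undifferentiated-cell marker versus a differentiated-cell marker.
ratio = undiff_to_diff_ratio(300, 1000)
print(ratio)                                             # 0.3
print(exceeds_reference(ratio, reference_ratio=0.15))    # True
```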

It is understood that the present invention can be performed on the general population to assess the presence or risk of disease. In another embodiment of the present invention, target patients may be tested to detect a particular type of disease, for example colon cancer. In addition, according to the present invention, subgroups of patients who are already thought to be at some increased risk, for example those having a weak family history, may be tested.

In general, an exemplary kit of the present invention will contain compounds suitable for detecting biomarker(s) associated with the differentiation state of a cell. The biomarker can include, but is not limited to, Shh (Sonic hedgehog), Tcf4, Lef1, Twist, EphB2, EphB3, Hes1, Notch1, Hoxa9, Dkk1, Tle6, Tcf3, Bmi1, Kit, Musashi1 (Msi1), Cdx1, Hes5, Oct4, Ki-67, β-catenin, Noggin, BMP4, PTEN (phosphorylated PTEN), Akt (phosphorylated Akt), Villin, Aminopeptidase N (anpep), Sucrase isomaltase (SI), Ephrin-B1 (EfnB1), Cdx2, Crip, Apoa1, Aldh1b1, Calb3, Dgat1, Dgat2, Clu, Hephaestin, Gas1, Ihh (Indian hedgehog), Intrinsic factor B12 receptor, IFABP, or KLF4. As discussed below, a biomarker can be detected using proteomic and microarray techniques. The equipment, instructions and reagents necessary for detecting a cell differentiation-related biomarker can be included in a kit of the invention.

As described above, a kit may optionally include one or more probes or primers which can identify a specific imprinted gene or group of genes. Typically, such probes will be nucleic acids or monoclonal antibodies and will be linked to, for example, a fluorescent label. In the case of detecting LOI by relying on the differential rates of transcription of two polymorphic alleles, the kit may comprise means for the amplification of the mRNAs corresponding to the two polymorphic alleles of the gene in question. Examples of such means include suitable DNA primers for the PCR amplification of the mRNAs corresponding to the two polymorphic alleles of the gene in question. Specific examples of such means include any pair of DNA primers which will anneal to and amplify any gene which is normally imprinted and in which a polymorphism is present. The kit may further include means for identifying the products of the amplification of the mRNAs corresponding to the two polymorphic alleles of the gene in question. Such means include, but are not limited to, a restriction enzyme which specifically cleaves one of the products of the amplification of the mRNAs corresponding to the two polymorphic alleles of the gene in question. Specific examples of such enzymes include, but are not limited to, Apa I in the case of the IGF2 gene. As described below, a kit of the invention may optionally include devices, reagents and/or instructions for testing a sample using proteomic and microarray technology.

Proteomics and Microarrays

Proteomics provides methods for predicting an increased risk of developing a cell proliferation or neoplastic disorder in a subject well before neoplastic tissue is identified in the subject. Proteomics is an evolving technology capable of testing for the presence of minute amounts of a vast array of proteins using small samples of human tissue. Using proteomic tools, increased or decreased levels of certain proteins in a biological sample, such as intestinal tissue, urine, or serum, can be ascertained. In addition, using mathematical algorithms, a complex proteome or “fingerprint” can be obtained. As previously noted, such algorithms include “factor analyses” and “principal component analysis (PCA).” The proteome can consist of a group of proteins, some increased in concentration from normal and others decreased, that are indicative of an increased risk of developing a cell proliferation or neoplastic disorder, such as those associated with colon or pancreatic cancer.

Thus, in another embodiment, a method of determining whether a subject is predisposed to developing a cell proliferation or neoplastic disorder using proteomic and/or microarray technology is provided. The method can include obtaining a biological sample from a subject and contacting the sample with an array of immobilized biomolecules that specifically interact with a biomarker indicative of a differentiated or undifferentiated cell. The method may further include obtaining a subject profile by detecting a modification of the biomolecules that is indicative of the ratio of differentiated to undifferentiated cells in the sample and comparing the subject profile with a reference profile. Generally, the reference profile includes one or more values, each value representing the level of biomarker in a reference sample obtained from one or more reference subjects displaying normal imprinting of a target gene. Optionally, the method includes identifying, in the same or different sample, cells displaying abnormal expression of at least one target gene in a normal biological sample from the subject.

A “subject” profile is generally described as a “test” profile. A subject profile can be generated from a sample taken from a subject in order to identify the subject's risk of developing a cell proliferation or neoplastic disorder. Thus, a “subject” profile is generated from a subject being tested for a predisposition to such a disorder. The subject profile can include, for example, the previously discussed ratio obtained from identifying differentiated and undifferentiated cells in a sample. In general, a “reference” profile can be described as a “control” profile. A reference profile can be generated from a sample taken from a particular tissue of a normal individual, or series of individuals, or those having a cell proliferation or neoplastic disorder. The reference profile, or plurality of reference profiles, can be used to establish threshold values for the levels of, for example, specific biomarkers in a particular tissue sample, such as those associated with epithelial cells obtained from crypts of the intestinal lumen. A “reference” profile can include a profile generated from normal subjects or a profile generated from subjects having a cell proliferative disorder. As previously noted, subject profiles and reference profiles can be expressed as an array “signature” or “pattern” of specific identifiable biomarkers. The array signature can be color-coded for easy visual or computer-aided identification. The signature can also be described as a number or numbers that correspond to values attributed to the biomarkers identified by the array.

The invention provides an array (i.e., “biochip” or “microarray”) that includes immobilized biomolecules that facilitate the detection of a particular molecule or molecules in a biological sample. Biomolecules that identify the biomarkers described above (e.g., biomarkers that distinguish differentiated from undifferentiated cells) can be included in a custom array for detecting subjects predisposed to a cell proliferation or neoplastic disorder. For example, a custom array can include biomolecules that identify villin or twist. Arrays comprising biomolecules that specifically identify selected biomarkers can be used to develop a database of information using data provided in the present specification. Additional biomolecules that identify factors related to cellular differentiation which lead to improved cross-validated error rates in multivariate prediction models (e.g., logistic regression, discriminant analysis, or regression tree models) can be included in a custom array of the invention.

The term “array,” as used herein, generally refers to a predetermined spatial arrangement of binding islands, biomolecules, or spatial arrangements of binding islands or biomolecules. Arrays according to the present invention that include biomolecules immobilized on a surface may also be referred to as “biomolecule arrays.” Arrays according to the present invention that comprise surfaces activated, adapted, prepared, or modified to facilitate the binding of biomolecules to the surface may also be referred to as “binding arrays.” Further, the term “array” may be used herein to refer to multiple arrays arranged on a surface, such as would be the case where a surface bore multiple copies of an array. Such surfaces bearing multiple arrays may also be referred to as “multiple arrays” or “repeating arrays.” The use of the term “array” herein may encompass biomolecule arrays, binding arrays, multiple arrays, and any combination thereof; the appropriate meaning will be apparent from context. An array can include biomolecules that distinguish differentiated from undifferentiated cells. The biological sample can include fluid or solid samples from any tissue of the body including excretory fluids such as urine.

An array of the invention comprises a substrate. By “substrate” or “solid support” or other grammatical equivalents herein is meant any material that is appropriate for the attachment of biomolecules and is amenable to at least one detection method. As will be appreciated by those in the art, the number of possible substrates is very large. Possible substrates include, but are not limited to, glass and modified or functionalized glass, plastics (including acrylics, polystyrene and copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene, polyurethanes, Teflon, etc.), polysaccharides, nylon or nitrocellulose, resins, silica or silica-based materials including silicon and modified silicon, carbon, metals, inorganic glasses, plastics, ceramics, and a variety of other polymers. In addition, as is known in the art, the substrate may be coated with any number of materials, including polymers, such as dextrans, acrylamides, gelatins or agarose. Such coatings can facilitate the use of the array with a biological sample derived from urine or serum.

A planar array of the invention will generally contain addressable locations (e.g., “pads”, “addresses” or “micro-locations”) of biomolecules in an array format. The size of the array will depend on the composition and end use of the array. Arrays containing from about 2 different biomolecules to many thousands can be made. Generally, the array will comprise from two to as many as 100,000 or more, depending on the end use of the array. A microarray of the invention will generally comprise at least one biomolecule that identifies or “captures” a biomarker, such as, for example, villin, ephrin-B1, musashi1, or twist, or antagonist thereof, present in a biological sample. In some embodiments, the compositions of the invention may not be in an array format; that is, for some embodiments, compositions comprising a single biomolecule may be made as well. In addition, in some arrays, multiple substrates may be used, either of different or identical compositions. Thus, for example, large planar arrays may comprise a plurality of smaller substrates.

As an alternative to planar arrays, bead based assays in combination with flow cytometry have been developed to perform multiparametric immunoassays. In bead based assay systems the biomolecules can be immobilized on addressable microspheres. Each biomolecule for each individual immunoassay is coupled to a distinct type of microsphere (i.e., “microbead”) and the immunoassay reaction takes place on the surface of the microspheres. Dyed microspheres with discrete fluorescence intensities are loaded separately with their appropriate biomolecules. The different bead sets carrying different capture probes can be pooled as necessary to generate custom bead arrays. Bead arrays are then incubated with the sample in a single reaction vessel to perform the immunoassay.

Product formation of the biomarkers with their immobilized capture biomolecules can be detected with a fluorescence based reporter system. Biomarkers can either be labeled directly by a fluorogen or detected by a second fluorescently labeled capture biomolecule. The signal intensities derived from captured biomarkers are measured in a flow cytometer. First, the flow cytometer identifies each microsphere by its individual color code. Second, the amount of captured biomarkers on each individual bead is measured by the second color fluorescence specific for the bound target. This allows multiplexed quantitation of multiple targets from a single sample within the same experiment. Sensitivity, reliability and accuracy are compared to standard microtiter ELISA procedures. With bead based immunoassay systems, cytokines can be simultaneously quantified from biological samples. An advantage of bead based systems is the individual coupling of the capture biomolecule to distinct microspheres.
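The two-step readout described above (identify the bead by its color code, then read the reporter fluorescence) can be sketched in a few lines. The event format, the bead-to-marker mapping, and the use of a median as the per-marker summary are assumptions made for illustration; real instruments export richer data and apply their own gating.

```python
from collections import defaultdict
from statistics import median


def summarize_bead_events(events, bead_to_marker):
    """Group reporter intensities by the biomarker each bead set captures.

    events: iterable of (bead_code, reporter_intensity) pairs, one per bead
            detected by the cytometer (hypothetical, simplified format).
    bead_to_marker: mapping from bead color code to the captured biomarker.
    Returns a dict of biomarker name -> median reporter intensity.
    """
    intensities = defaultdict(list)
    for bead_code, reporter_intensity in events:
        marker = bead_to_marker.get(bead_code)
        if marker is None:
            continue  # bead code not part of this custom bead array
        intensities[marker].append(reporter_intensity)
    return {marker: median(values) for marker, values in intensities.items()}


# Hypothetical pooled bead array with two bead sets.
bead_to_marker = {1: "villin", 2: "twist"}
events = [(1, 100.0), (1, 120.0), (1, 110.0), (2, 40.0), (2, 60.0), (9, 5.0)]
print(summarize_bead_events(events, bead_to_marker))
```

A median is used here because it is less sensitive than a mean to a few aggregated or damaged beads; this is a design choice of the sketch, not a requirement of the method.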

Thus, microbead array technology can be used to sort cell differentiation markers bound to a specific biomolecule using a plurality of microbeads, each of which can carry about 100,000 identical molecules of a specific anti-tag biomolecule on its surface. Once captured, the biomarker can be handled as fluid, referred to herein as a “fluid microarray.”

An array of the present invention encompasses any means for detecting a biomarker molecule such as a cell differentiation marker, or antagonist thereof. For example, microarrays can be biochips that provide high-density immobilized arrays of recognition molecules (e.g., antibodies), where biomarker binding is monitored indirectly (e.g., via fluorescence). In addition, an array can be of a format that involves the capture of proteins by biochemical or intermolecular interaction, coupled with direct detection by mass spectrometry (MS).

Arrays and microarrays that can be used with the new methods to detect the biomarkers described herein can be made according to the methods described in U.S. Pat. Nos. 6,329,209; 6,365,418; 6,406,921; 6,475,808; and 6,475,809, and U.S. patent application Ser. No. 10/884,269, which are incorporated herein in their entirety. New arrays to detect specific selections of sets of biomarkers described herein can also be made using the methods described in these patents.

In many embodiments, immobilized biomolecules, or biomolecules to be immobilized, are proteins. One or more types of proteins may be immobilized on a surface. In certain embodiments, the proteins are immobilized using methods and materials that minimize the denaturing of the proteins, that minimize alterations in the activity of the proteins, or that minimize interactions between the protein and the surface on which they are immobilized.

Surfaces useful according to the present invention may be of any desired shape (form) and size. Non-limiting examples of surfaces include chips, continuous surfaces, curved surfaces, flexible surfaces, films, plates, sheets, tubes, and the like. Surfaces preferably have areas ranging from approximately a square micron to approximately 500 cm². The area, length, and width of surfaces according to the present invention may be varied according to the requirements of the assay to be performed. Considerations may include, for example, ease of handling, limitations of the material(s) of which the surface is formed, requirements of detection systems, requirements of deposition systems (e.g., arrayers), and the like.

In certain embodiments, it is desirable to employ a physical means for separating groups or arrays of binding islands or immobilized biomolecules: such physical separation facilitates exposure of different groups or arrays to different solutions of interest. Therefore, in certain embodiments, arrays are situated within wells of 96, 384, 1536, or 3456 microwell plates. In such embodiments, the bottoms of the wells may serve as surfaces for the formation of arrays, or arrays may be formed on other surfaces then placed into wells. In certain embodiments, such as where a surface without wells is used, binding islands may be formed or biomolecules may be immobilized on a surface and a gasket having holes spatially arranged so that they correspond to the islands or biomolecules may be placed on the surface. Such a gasket is preferably liquid tight. A gasket may be placed on a surface at any time during the process of making the array and may be removed if separation of groups or arrays is no longer necessary.

The immobilized biomolecules can bind to molecules present in a biological sample overlying the immobilized biomolecules. Alternatively, the immobilized biomolecules modify or are modified by molecules present in a biological sample overlying the immobilized biomolecules. For example, a cell differentiation marker present in a biological sample can contact an immobilized biomolecule and bind to it, thereby facilitating detection of the marker. Alternatively, the cell differentiation marker, or antagonist thereof, can contact a biomolecule immobilized on a solid surface in a transient fashion and initiate a reaction that results in the detection of the marker absent the stable binding of the marker to the biomolecule.

Modifications or binding of biomolecules in solution or immobilized on an array may be detected using detection techniques known in the art. Examples of such techniques include immunological techniques such as competitive binding assays and sandwich assays; fluorescence detection using instruments such as confocal scanners, confocal microscopes, or CCD-based systems and techniques such as fluorescence, fluorescence polarization (FP), fluorescence resonant energy transfer (FRET), total internal reflection fluorescence (TIRF), fluorescence correlation spectroscopy (FCS); colorimetric/spectrometric techniques; surface plasmon resonance, by which changes in mass of materials adsorbed at surfaces may be measured; techniques using radioisotopes, including conventional radioisotope binding and scintillation proximity assays (SPA); mass spectroscopy, such as matrix-assisted laser desorption/ionization mass spectroscopy (MALDI) and MALDI-time of flight (TOF) mass spectroscopy; ellipsometry, which is an optical method of measuring thickness of protein films; quartz crystal microbalance (QCM), a very sensitive method for measuring mass of materials adsorbing to surfaces; scanning probe microscopies, such as AFM and SEM; and techniques such as electrochemical, impedance, acoustic, microwave, and IR/Raman detection. See, e.g., Mere L, et al., “Miniaturized FRET assays and microfluidics: key components for ultra-high-throughput screening,” Drug Discovery Today 4(8):363-369 (1999), and references cited therein; Lakowicz J R, Principles of Fluorescence Spectroscopy, 2nd Edition, Plenum Press (1999).

Arrays of the invention suitable for identifying an increased risk of developing a cell proliferation or neoplastic disorder may be included in kits. Such kits may also include, as non-limiting examples, reagents useful for preparing biomolecules for immobilization onto binding islands or areas of an array, reagents useful for detecting modifications to immobilized biomolecules, or reagents useful for detecting binding of biomolecules from solutions of interest to immobilized biomolecules, and instructions for use. Thus, in another embodiment, a diagnostic kit for detecting a cell proliferation or neoplastic disorder, or a predisposition to a cell proliferation or neoplastic disorder, is provided. Such kits can include a means for identifying a subject comprising cells displaying abnormal imprinting of at least one target gene and an array for detecting a biomarker indicative of a differentiated or undifferentiated cell, the array comprising a substrate having a plurality of addresses, each address having disposed thereon an immobilized biomolecule. Each biomolecule can individually detect a biomarker indicative of a differentiated or undifferentiated cell. As will be discussed below, in addition to identifying subjects predisposed to developing a neoplastic disorder, methods provided herein can be used to follow the progress of a subject undergoing treatment for such a disorder.

Theranostics

The invention provides compositions and methods for the identification of a predisposition to a cell proliferation or neoplastic disorder such that a theranostic approach can be taken to test such individuals to determine the effectiveness of a particular therapeutic intervention (pharmaceutical or non-pharmaceutical) and to alter the intervention to 1) reduce the risk of developing adverse outcomes and 2) enhance the effectiveness of the intervention. Thus, in addition to diagnosing or confirming the presence of or risk for a cell proliferation or neoplastic disorder, the methods and compositions of the invention also provide a means of optimizing the treatment of a subject having such a disorder. The invention provides a theranostic approach to treating a cell proliferation or neoplastic disorder by integrating diagnostics and therapeutics to improve the real-time treatment of a subject having, for example, LOI of the IGF2 gene. Practically, this means creating tests that can identify which patients are most suited to a particular therapy, and providing feedback on how well a drug is working to optimize treatment regimens. In the area of diseases associated with cell proliferation or neoplastic disorders, theranostics can flexibly monitor changes in important parameters (e.g., an increase or decrease in the ratio of differentiated vs. undifferentiated cells in a tissue sample) over time. For example, theranostic multiparameter immunoassays specific for a series of diagnostically relevant molecules, such as those that distinguish differentiated from undifferentiated cells, can be used to follow the progress of a subject undergoing treatment for the prevention of colon cancer.

Within the clinical trial setting, a theranostic method or composition of the invention can provide key information to optimize trial design, monitor efficacy, and enhance drug safety. For instance, “trial design” theranostics can be used for patient stratification, determination of patient eligibility (inclusion/exclusion), creation of homogeneous treatment groups, and selection of patient samples that are representative of the general population. Such theranostic tests can therefore provide the means for patient efficacy enrichment, thereby minimizing the number of individuals needed for trial recruitment. “Efficacy” theranostics are useful for monitoring therapy and assessing efficacy criteria. Finally, “safety” theranostics can be used to prevent adverse drug reactions or avoid medication error.

Statistical Analyses

The data presented herein provide a database of information related to diagnosing cell proliferation or neoplastic disorders. Prediction rules can be selected based on cross-validation, with the chosen rule further validated on a separate cohort. A variety of approaches can be used to generate data predictive of a cell proliferation or neoplastic disorder based on cell differentiation marker levels provided herein, including discriminant analysis, logistic regression, and regression trees.

Discriminant analysis attempts to find a plane in the multivariate space of the marker data such that, to the extent possible, cases appear on one side of this plane, and controls on the other. The coefficients which determine this plane constitute a classification rule: a linear function of the marker values which is compared with a threshold. In Bayesian classification, information on the probability of a subject being a case (i.e., a subject having, or predisposed to having, a cell proliferation or neoplastic disorder) that is known before the data are obtained can be employed. For example, the prior probability of being a case can be set to about 0.5; for a screening test applied to a general population the corresponding probability will be approximately 0.05. A subject is classified as having, or at risk of having, a complication (i.e., a cell proliferation or neoplastic disorder) if the corresponding posterior probability (i.e., the prior probability updated using the data) exceeds 0.5.
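The effect of the prior probability on the classification can be illustrated with a short sketch. It assumes that the linear function of the marker values is scaled as a log likelihood ratio (as it is for linear discriminant analysis with a shared covariance matrix), so that the posterior log odds equal the score plus the prior log odds. The weights, intercept, and marker values are hypothetical and are not fitted to any data in this disclosure.

```python
import math


def posterior_case_probability(markers, weights, intercept, prior):
    """Posterior probability of being a case for one subject.

    markers, weights: equal-length sequences (marker values and coefficients).
    intercept: constant term of the linear rule.
    prior: probability of being a case before the marker data are seen.
    Assumes the linear score is a log likelihood ratio (case vs. control).
    """
    score = intercept + sum(w * x for w, x in zip(weights, markers))
    log_odds = score + math.log(prior / (1.0 - prior))
    return 1.0 / (1.0 + math.exp(-log_odds))


def classify_as_case(markers, weights, intercept, prior, cutoff=0.5):
    """Classify as a case when the posterior probability exceeds the cutoff."""
    return posterior_case_probability(markers, weights, intercept, prior) > cutoff


# Hypothetical two-marker rule applied to the same subject under two priors.
markers, weights, intercept = [2.0, 1.0], [1.0, -0.5], -0.5
print(classify_as_case(markers, weights, intercept, prior=0.5))    # True
print(classify_as_case(markers, weights, intercept, prior=0.05))   # False
```

In this example the same marker values lead to a positive call under a prior of 0.5 but not under the screening prior of 0.05, which is why the prior has to be chosen to match the population being tested.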

Additional patient information (e.g., LOI and/or family history) can be combined with the cell differentiation markers provided herein. These data can be combined in a database that analyzes the information to identify trends that complement the present biomarker data. Results can be stored in an electronic format.

Additional analyses can be performed to identify subjects at risk for cell proliferation or neoplastic disorders such as colon cancer or pancreatic cancer. Such analyses include bivariate analysis of each of the primary exposures, multivariate models including variables with a strong relationship (biologic and statistical) with outcomes, methods to account for multiple critical exposures including variable reduction using factor analysis, and prediction models.

For bivariate analysis, the mean level of each primary exposure can be compared between cases and controls using a 2-sample t-test or Wilcoxon Rank Sum test, as appropriate. If the association appears linear, trend can be analyzed using the Mantel Haenszel test. Data can be assembled into less fine categories (e.g., tertiles) using the distribution of the controls, and these categories can be examined as indicator variables in multivariable analysis.
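The categorization step can be sketched as follows: cut points are taken from the control distribution only, and each subject's value is then coded as indicator variables with the lowest tertile as the reference category. The marker values are hypothetical, and `statistics.quantiles` with its default settings is one of several reasonable ways to define tertile cut points.

```python
from statistics import quantiles


def tertile_cutpoints(control_values):
    """Return the two cut points that split the control values into tertiles."""
    low, high = quantiles(control_values, n=3)
    return low, high


def tertile_indicators(value, cutpoints):
    """Code a value as (middle_tertile, upper_tertile) indicator variables.

    The lowest tertile is the reference category and is coded (0, 0).
    """
    low, high = cutpoints
    if value <= low:
        category = 0
    elif value <= high:
        category = 1
    else:
        category = 2
    return int(category == 1), int(category == 2)


# Hypothetical marker levels measured in controls.
controls = [1, 2, 3, 4, 5, 6, 7, 8, 9]
cuts = tertile_cutpoints(controls)
for level in (2, 5, 8):
    print(level, tertile_indicators(level, cuts))
```

The resulting indicator pairs can be entered as covariates in a multivariable model in place of the continuous marker level.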

For multivariate analyses, data can be correlated between two control groups, one matched and another not matched. In both matched and unmatched analyses, the independent effects of all primary exposures of interest can be examined using logistic regression models (with conditional models in matched analyses). The models can include a minimum number of covariates to test the main effect of specific predictors.

Databases and Computerized Methods of Analyzing Data

A database generated from the methods provided herein and the analyses described above can be included in, or associated with, a computer system for determining whether a subject has, or is predisposed to having, a cell proliferation or neoplastic disorder. The database can include a plurality of digitally-encoded “reference” (or “control”) profiles. Each reference profile of the plurality can have a plurality of values, each value representing a level of a biomarker in a sample. Alternatively, a reference profile can be derived from an individual that is normal. Both types of profiles can be included in the database for consecutive or simultaneous comparison to a subject profile. The computer system can include a server containing a computer-executable code for receiving a profile of a subject and identifying from the database a matching reference profile that is diagnostically relevant to the subject profile. The identified profile can be supplied to a caregiver for diagnosis or further analysis.
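One way such a lookup could work is sketched below. The source does not specify how a matching reference profile is identified; this example assumes, purely for illustration, that each profile is a list of biomarker levels in a fixed order and that the closest profile by Euclidean distance is returned. The profile labels and values are hypothetical.

```python
import math


def closest_reference(subject_profile, reference_profiles):
    """Return (label, distance) of the reference profile nearest the subject.

    Assumption: nearest means smallest Euclidean distance over biomarker
    levels listed in the same order in every profile.
    """
    label = min(
        reference_profiles,
        key=lambda name: math.dist(subject_profile, reference_profiles[name]),
    )
    return label, math.dist(subject_profile, reference_profiles[label])


# Hypothetical reference database: label -> biomarker levels.
reference_profiles = {
    "normal_reference": [1.0, 1.0, 5.0],
    "disorder_reference": [3.0, 3.5, 2.0],
}

label, distance = closest_reference([2.8, 3.1, 2.4], reference_profiles)
```

In practice the matched label, together with the distance, could be reported to a caregiver for further analysis rather than treated as a diagnosis.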

Thus, the various techniques, methods, and aspects of the invention described above can be implemented in part or in whole using computer-based systems and methods. Additionally, computer-based systems and methods can be used to augment or enhance the functionality described above, increase the speed at which the functions can be performed, and provide additional features and aspects as a part of or in addition to those of the invention described elsewhere in this document. Various computer-based systems, methods and implementations in accordance with the above-described technology are presented below.

A processor-based system can include a main memory, preferably random access memory (RAM), and can also include a secondary memory. The secondary memory can include, for example, a hard disk drive and/or a removable storage drive, representing a floppy disk drive, a magnetic tape drive, an optical disk drive, etc. The removable storage drive reads from and/or writes to a removable storage medium. Removable storage medium refers to a floppy disk, magnetic tape, optical disk, and the like, which is read by and written to by a removable storage drive. As will be appreciated, the removable storage medium can comprise computer software and/or data.

In alternative embodiments, the secondary memory may include other similar means for allowing computer programs or other instructions to be loaded into a computer system. Such means can include, for example, a removable storage unit and an interface. Examples of such can include a program cartridge and cartridge interface (such as that found in video game devices), a removable memory chip (such as an EPROM or PROM) and associated socket, and other removable storage units and interfaces, which allow software and data to be transferred from the removable storage unit to the computer system.

The computer system can also include a communications interface. Communications interfaces allow software and data to be transferred between the computer system and external devices. Examples of communications interfaces can include a modem, a network interface (such as, for example, an Ethernet card), a communications port, a PCMCIA slot and card, and the like. Software and data transferred via a communications interface are in the form of signals, which can be electronic, electromagnetic, optical or other signals capable of being received by a communications interface. These signals are provided to the communications interface via a channel capable of carrying signals and can be implemented using a wireless medium, wire or cable, fiber optics or other communications medium. Some examples of a channel can include a phone line, a cellular phone link, an RF link, a network interface, and other communications channels.

In this document, the terms “computer program medium” and “computer usable medium” are used to refer generally to media such as a removable storage device, a disk capable of installation in a disk drive, and signals on a channel. These computer program products are means for providing software or program instructions to a computer system.

Computer programs (also called computer control logic) are stored in main memory and/or secondary memory. Computer programs can also be received via a communications interface. Such computer programs, when executed, enable the computer system to perform the features of the invention as discussed herein. In particular, the computer programs, when executed, enable the processor to perform the features of the invention. Accordingly, such computer programs represent controllers of the computer system.

In an embodiment where the elements are implemented using software, the software may be stored in, or transmitted via, a computer program product and loaded into a computer system using a removable storage drive, hard drive or communications interface. The control logic (software), when executed by the processor, causes the processor to perform the functions of the invention as described herein.

In another embodiment, the elements are implemented primarily in hardware using, for example, hardware components such as PALs, application specific integrated circuits (ASICs) or other hardware components. Implementation of a hardware state machine so as to perform the functions described herein will be apparent to persons skilled in the relevant art(s). In yet another embodiment, elements are implemented using a combination of both hardware and software.

In another embodiment, the computer-based methods can be accessed or implemented over the World Wide Web by providing access via a Web Page to the methods of the invention. Accordingly, the Web Page is identified by a Uniform Resource Locator (URL). The URL denotes both the server machine and the particular file or page on that machine. In this embodiment, it is envisioned that a consumer or client computer system interacts with a browser to select a particular URL, which in turn causes the browser to send a request for that URL or page to the server identified in the URL. Typically the server responds to the request by retrieving the requested page and transmitting the data for that page back to the requesting client computer system (the client/server interaction is typically performed in accordance with the Hypertext Transfer Protocol (“HTTP”)). The selected page is then displayed to the user on the client's display screen. The client may then cause the server containing a computer program of the invention to launch an application to, for example, perform an analysis according to the invention.

The invention is further described in the following examples, which serve to illustrate but not to limit the scope of the invention described in the claims.

EXAMPLES

A mouse model was created to investigate the mechanism by which LOI of IGF2 contributes to intestinal tumorigenesis. Previous analyses of mouse models by other groups have shown that Igf2 is activated more than 25-fold in pancreatic tumors induced by the SV40 large T antigen (Christofori, et al., Nat. Genet. 10, 196 (1995)) and that forced overexpression of Igf2 causes intestinal tumor formation and hyperproliferation of crypt epithelium (Hassan and Howell, Cancer Res. 60, 1070 (2000); Bennett, et al., Development 130, 1079 (2003)). The model provided herein was designed to mimic the human situation, where LOI causes only a modest increase in IGF2 expression. Imprinting of Igf2 is regulated by a differentially methylated region (DMR) upstream of the nearby untranslated H19 gene. Deletion of the DMR leads to biallelic expression (LOI) of Igf2 in the offspring when the deletion is maternally inherited (FIGS. 3A-3C). To model intestinal neoplasia, we used Min mice with an Apc mutation (Su et al., Science 256, 668 (1992)). We crossed female H19+/− with male Apc+/Min, comparing littermates harboring Apc mutations with or without a maternally inherited H19 deletion, and thus with or without LOI. In comparison with H19+/+ [hereafter referred to as LOI(−) mice], the H19−/+ mutant mice [hereafter referred to as LOI(+) mice] showed an approximate doubling in Igf2 mRNA levels that did not vary with age or Min status (FIGS. 4A-4B). This is consistent with the 2- to 3-fold increase in Igf2 mRNA levels in normal human colonic mucosa or Wilms tumors that are LOI(+) (Ravenel et al., J. Natl. Cancer Inst. 93, 1698 (2001)). The level of Igf2 protein was also doubled in the intestine of LOI(+) mice (FIGS. 4A-4B). The LOI(+) mice developed about twice as many adenomas in both small intestine and colon as did the LOI(−) mice, and this difference was statistically significant (Table 1). Mice with LOI also had longer intestinal crypts, the site of epithelial stem cell renewal (Sell and Pierce, Lab. Invest. 70, 6 (1994)) (FIGS. 5A-5B). This increase in length was specific to the crypts, progressed over time [1.2-fold increase (P<0.01) in mice at 42 days of age and 1.5-fold increase (P<0.0001) in mice at 120 days], and was independent of Apc status. The increase in crypt length was not due to differences in cell proliferation, as there was no statistically significant difference in proliferating cell nuclear antigen labeling index between LOI(+) and LOI(−) Min mice (3.8±0.9 vs. 3.1±1.5, respectively), nor was there a difference in the distribution (Lipkin and Deschner, Cancer Res. 36, 2665 (1976)) of proliferative cells within the crypt (0.39±0.04 vs. 0.38±0.03, respectively, P=N.S.). The LOI(+) and LOI(−) mice showed no difference in crypt apoptotic rates, as assessed histomorphologically and by in situ TUNEL assay; both genotypes had an average of 1 apoptotic cell per 20 crypts. There was also no difference in the rate of branching of intestinal crypts; both LOI(+) and LOI(−) mice had 1-2 total branched crypts below the intestinal surface.

Increased crypt length of the small intestine correlates with a shift in the ratio of undifferentiated to differentiated epithelial cells in the mucosa. Four antigens were immunostained to distinguish undifferentiated versus differentiated epithelial cell development: villin, a structural component of the brush border cytoskeleton in gastrointestinal tract epithelia (West et al., Gastroenterol. 94, 343 (1988)); ephrin-B1, the ligand of the EphB2/EphB3 receptors that play a role in allocating epithelial cells within the crypt-villus axis in intestinal epithelium (Batlle et al., Cell 111, 251 (2002)); musashi1, an RNA-binding protein selectively expressed in neural and intestinal progenitor cells and key to maintaining the stem cell state (Kaneko et al., Dev. Neurosci. 22, 139 (2000); Potten et al., Differentiation 71, 28 (2003)); and twist, a transcription factor of the basic helix-loop-helix family originally identified as a mesodermal progenitor cell marker (Borkowski, et al., Development 121, 4183 (1995)) that is also involved in loss of differentiation of epithelial cells (Howe, et al., Cancer Res. 63, 1906 (2003); Thiery and Morgan, Nat. Med. 10, 777 (2004)).

FIGS. 1A-1F depict immunohistochemical analysis of villin and musashi1 in 120 day old LOI(−) and LOI(+) mice. FIG. 1A shows that, in LOI(−) mice, villin protein expression is noted in a cytoplasmic distribution throughout differentiated enterocytes lining intestinal villi and within the crypt-villus interface. FIG. 1B shows that, in LOI(+) mice, villin expression is markedly decreased. FIG. 1C shows that, in LOI(−) mice, musashi1 expression is detected within the cytoplasm and nuclei in rare cells within intestinal crypts (arrow), the location of intestinal stem cells and the undifferentiated epithelial cell compartment. FIG. 1D shows that, in contrast to FIG. 1C, musashi1 cytoplasmic and nuclear labeling is detected throughout the intestinal crypts of LOI(+) mice. FIG. 1E shows that, in LOI(−) mice, rare musashi1-positive cells are detected within the overlying intestinal villi representing the differentiated epithelial compartment. FIG. 1F shows that, in LOI(+) mice, intense cytoplasmic and nuclear expression of musashi1 is detected within enterocytes lining intestinal villi. Scale bars correspond to 10 μm.

Consistent with their biologic roles in differentiated enterocytes, immunostaining for both villin and ephrin-B1 was detected within the cytoplasm of enterocytes lining the villi of the small intestine and within the villus-crypt interface in LOI(−) mice (FIG. 1A) (FIG. 6). The LOI(+) mice, in contrast, showed lower levels of villin and ephrin-B1 and a contraction of the differentiated epithelial cell compartment (FIG. 1B) (FIGS. 6A-6D).

Expression of the progenitor cell marker musashi1 was observed in scattered cells within the lower half of intestinal crypts in LOI(−) mice (FIG. 1C), whereas numerous musashi1-positive cells were identified within the intestinal crypts of LOI(+) mice (FIG. 1D). The LOI(+) mice also showed intense staining within enterocytes lining the intestinal villi compared with LOI(−) mice (FIGS. 1E-1F). A semi-quantitative analysis confirmed increased musashi1 staining in the LOI(+) mice, independent of Apc status (Table 2). Immunostaining for twist also revealed a marked increase in the number and intensity of positively-staining cells in the crypts of LOI(+) mice (FIGS. 7A-7F). These changes were progressive over time (see, e.g., FIGS. 1A-1F, 6A-6D and 7A-7F).

Because this shift affects normal mucosa, one prediction of this de-differentiation model is that the increased number of adenomas is due to an increase in tumor initiation rather than an increase in tumor progression. Supporting this idea, there was no difference in the ratio of microadenomas [<5 crypts each, (Torrance et al., Nat. Med. 6, 1024 (2000))] to macroadenomas (≥5 crypts each) between LOI(+) Min mice (36 micro/27 macro) and LOI(−) Min mice (16 micro/14 macro) at 120 days. An independent mouse model of LOI, in which point mutations had been introduced in three of the four CCCTC-binding factor (CTCF) target sites within the H19 DMR (Pant et al., Genes Dev. 17, 586 (2003)) (FIGS. 3A-3C and FIGS. 8A-8F), was also examined by immunostaining. Another advantage of this model is that, unlike the deletion model, H19 expression is intact in the DMR mutation model (FIGS. 9A-9F). Loss of H19 might have independent effects given its known role on mRNA translation in trans. Nevertheless, a shift in the ratio of differentiated to undifferentiated cells was also seen in the normal epithelium of these LOI(+) mice. For example, FIGS. 2A-2H depict a shift to less differentiated colon epithelium in a mouse H19 DMR mutation model and in colonoscopy clinic patients with LOI. Musashi1 immunostaining in LOI(−) mice shows rare crypt epithelial cells with cytoplasmic labeling (FIG. 2A), compared with LOI(+) mice (FIG. 2B), which show aberrant musashi1 staining in both a cytoplasmic and nuclear pattern throughout the colonic epithelium. FIG. 2C shows that villin immunostaining in LOI(−) mice shows cytoplasmic labeling including the brush border. In contrast, in LOI(+) mice (FIG. 2D), villin staining of the brush border on the surface epithelial cells is absent. FIG. 2E shows that in 12 colonoscopy patients without LOI, rare musashi1-positive cells are detected in crypt epithelial cells (arrow). Low power view is available in FIGS. 10A-10B. In contrast, FIG. 2F shows that in colonoscopy patients with LOI, musashi1 labeling is present throughout colonic crypts with extension to the surface epithelium (see also FIGS. 10A-10B). In colonoscopy patients without LOI, only weak labeling for twist is detected (see FIG. 2G). In colonoscopy patients with LOI, patchy but strong twist labeling is present in the crypt and surface epithelium (see FIG. 2H). Scale bars correspond to 10 μm.

FIGS. 3A-3C depict mouse models of H19 deletion and DMR mutation. FIG. 3A is a diagram of the H19 deletion model. Thirteen kb including the H19 gene and its DMR in the upstream region were replaced with neo. When this deletion is inherited from the mother, H19 expression is lost and the normally silent Igf2 allele is activated as shown. Experimental crosses were performed between female H19+/− and male Apc+/Min mice to obtain the four genotypes shown. FIG. 3B is a diagram of the H19 DMR mutation model. Three of the four CTCF binding sites at the H19 DMR were mutated (closed boxes). When this mutation is inherited from the mother, the normally silent Igf2 allele is activated with H19 expression maintained (see also FIGS. 8A-8F). DMR-mutant (142*) female or male mice were crossed with wild type SD7 to obtain mice with LOI and normal imprinting of Igf2, respectively. FIG. 3C is a table of experiments performed with each model.

FIGS. 4A and 4B depict Igf2 mRNA and protein levels. FIG. 4A shows relative Igf2 mRNA level. Igf2 mRNA levels were analyzed by real-time RT-PCR, normalized to that of β-actin, and are displayed relative to the small intestine of wild type LOI(−) mice at 42 days. Igf2 mRNA was 2.0-fold greater in the non-tumor region of LOI(+) mouse intestine than in LOI(−) mouse intestine at 42 days (P=0.002), and 2.1-fold greater at 120 days (P=0.04). For LOI(+) Min mice at 120 days, Igf2 mRNA showed a 2.2-fold increase in the non-tumor region (P=0.03) and a 2.3-fold increase in the tumor region (P=0.003), compared with LOI(−) Min mice. Within a given genotype, the expression of Igf2 did not increase from normal to tumor, consistent with an early role for LOI in tumorigenesis. N, non-tumor region. T, tumor region. P values were calculated by t-test for each comparison. FIG. 4B shows western blot analysis of Igf2 protein. Signals were detected at 15 kDa, 17 kDa and weakly at 18 kDa using two separate antibodies (shown), and the intensities were increased 1.7-2.1 fold (Upstate) and 1.5-2.1 fold (Abcam) in the small intestine of LOI(+) mice, normalized to total protein. These higher molecular weight forms are well described in mammals and are more efficient activators of the Igf1 receptor (the signaling target of Igf2) than is the fully processed form of Igf2.

FIGS. 5A and 5B depict histomorphology of small intestinal mucosa in LOI(−) mice (FIG. 5A) versus LOI(+) mice (FIG. 5B). Detailed histopathological examinations of the small intestine, colon, and extraintestinal tissues were performed in both 42 day and 120 day (shown) mice. Although no architectural differences are seen in association with LOI status, the crypt length of the small intestine of LOI(+) mice showed a statistically significant increase compared to their wild-type littermates: 1.2-fold increase (15.3±1.9 μm vs. 13.1±1.8 μm, P<0.01) at 42 days; and 1.5-fold increase (19.6±2.0 μm vs. 13.0±2.0 μm, P<0.0001) at 120 days.

FIGS. 6A-6D depict immunohistochemistry for villin and ephrin-B1 in 42 day mice. FIG. 6A shows that, in LOI(−) mice, villin is found in a cytoplasmic distribution throughout differentiated enterocytes lining intestinal villi, with expression extending to the transition zone and superficial crypts. FIG. 6B shows that, in LOI(+) mice, villin is largely restricted to the enterocytes lining intestinal villi with no expression noted within the transition zone or superficial crypts (indicated by arrow), consistent with a contraction of the differentiated cell compartment. As shown in FIG. 6C, ephrin-B1 protein expression shows a pattern similar to that described for villin, seen as cytoplasmic labeling of differentiated enterocytes lining intestinal villi in LOI(−) mice. In contrast, FIG. 6D shows that immunostaining for ephrin-B1 is markedly decreased in the intestinal villi of LOI(+) mice.

FIGS. 7A-7F depict immunohistochemistry for musashi1 and twist in 42 day mice. FIG. 7A shows that, in LOI(−) mice, musashi1 expression can be detected within the cytoplasm and nuclei in rare cells within intestinal crypts (representatively indicated by arrow). FIG. 7B shows that, in LOI(+) mice, musashi1 cytoplasmic and nuclear labeling can be detected throughout the intestinal crypts. FIG. 7C shows that, in LOI(−) mice, no musashi1 expression is detected within the overlying intestinal villi. FIG. 7D shows that, in LOI(+) mice, ectopic cytoplasmic and nuclear expression is seen in enterocytes lining intestinal villi. FIG. 7E shows that weak cytoplasmic twist expression can be detected in rare cells within intestinal crypts in LOI(−) mice. FIG. 7F shows that twist is greatly increased within intestinal crypts of LOI(+) mice.

FIGS. 8A-8F depict in situ hybridization analysis of Igf2 mRNA levels in mouse gut with mutation in the H19 DMR (142* mouse). The panels are composite bright- and darkfield images. FIG. 8A shows fetal (E16.5) gut in a 142*×SD7 cross, antisense Igf2 riboprobe. FIG. 8B shows fetal gut, 142*×SD7 cross, sense probe. FIG. 8C shows adult (153 day) gut, 142*×SD7 cross, antisense probe. FIG. 8D shows fetal gut, SD7×142* cross, antisense probe. FIG. 8E shows fetal gut, SD7×142* cross, sense probe. FIG. 8F shows adult gut, SD7×142* cross, antisense probe.

FIGS. 9A-9F depict in situ hybridization analysis of H19 mRNA levels in E16.5 mouse embryos with mutation in the H19 DMR. FIG. 9A shows a brightfield view over the gut in a 142*×SD7 fetus using antisense Igf2 riboprobe. FIG. 9B shows a darkfield view, 142*×SD7 fetus, antisense probe. FIG. 9C shows a brightfield view, SD7×142* fetus, antisense probe. FIG. 9D shows a darkfield view, SD7×142* fetus, antisense probe. FIG. 9E shows a brightfield view, SD7×142* fetus, sense probe. FIG. 9F shows a darkfield view, SD7×142* fetus, sense probe.

FIGS. 10A-10B depict musashi1 immunostaining of normal colon of a colonoscopy patient without LOI and a patient with LOI. FIG. 10A shows that musashi1-positive cells were rarely observed in colonic crypts of patients without LOI, and there was no surface staining. A higher power view of the crypt indicated by an asterisk is available in FIG. 2E. In contrast, FIG. 10B shows that aberrant musashi1 protein expression can be detected in patients with LOI throughout colonic crypts with extension to surface epithelium (surface indicated by arrow). A higher power view of the crypt is available in FIG. 2F.

A comparison was made of normal mucosa of patients requiring biopsy during colonoscopic screening, whose LOI status was previously determined. No morphological differences were noted by conventional microscopy. However, 10 of 11 patients with LOI in the colon showed increased musashi1 staining extending to the upper half of colonic crypts and/or surface epithelium, compared with 5 of 15 patients without LOI (P=0.004, Fisher exact test) (FIGS. 2E-2F) (FIGS. 10A-10B). Altered colon epithelial maturation was also found in all 4 patients with LOI restricted to the colon (P=0.03), and in 6 of 7 patients with LOI in both peripheral blood lymphocytes and colon (P=0.03), compared with patients without LOI.
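The Fisher exact comparison reported above can be reproduced in outline with the standard library. The source does not state whether the test was one- or two-sided; the sketch below computes a one-sided (upper-tail) probability as an assumption, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_upper_tail(a, b, c, d):
    """One-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    Returns the hypergeometric probability of observing `a` or more
    positives in the first row, with all row and column totals fixed.
    """
    row1 = a + b          # size of the first group
    col1 = a + c          # total number of positives
    total = a + b + c + d
    largest = min(row1, col1)
    tail = sum(
        comb(col1, k) * comb(total - col1, row1 - k)
        for k in range(a, largest + 1)
    )
    return tail / comb(total, row1)


# Counts from the text: 10 of 11 patients with LOI showed increased staining,
# compared with 5 of 15 patients without LOI.
p_musashi1 = fisher_exact_upper_tail(10, 1, 5, 10)
```

Under this one-sided assumption the tail probability for the 10/11 versus 5/15 table is roughly 0.0045, which is in line with the reported P=0.004.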

The sensitivity was reduced but the specificity increased when musashi1 staining was combined with a second marker, twist: increased staining was seen in 6 of 11 patients with LOI, compared with 1 of 14 patients without LOI (P=0.02, Fisher exact test) (FIGS. 2G-2H). While twist staining alone did not achieve statistical significance (P=0.07), the two markers were non-overlapping, suggesting heterogeneity in downstream effects of LOI.
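The trade-off between sensitivity and specificity can be made concrete from the counts given in the text. This is a minimal sketch that treats LOI status as the reference standard, as the comparison above does; the helper names are hypothetical.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of LOI(+) patients who show increased staining."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of LOI(-) patients who do not show increased staining."""
    return true_negatives / (true_negatives + false_positives)


# musashi1 alone: 10 of 11 with LOI positive, 5 of 15 without LOI positive.
musashi1_sensitivity = sensitivity(10, 1)
musashi1_specificity = specificity(10, 5)

# musashi1 combined with twist: 6 of 11 with LOI positive, 1 of 14 without LOI positive.
combined_sensitivity = sensitivity(6, 5)
combined_specificity = specificity(13, 1)
```

With these counts, sensitivity falls from about 0.91 to about 0.55 and specificity rises from about 0.67 to about 0.93 when the second marker is required.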

Cellular mechanisms by which epigenetic alterations in normal cells affect cancer risk are discussed herein. The mechanisms effectively alter the balance of differentiated and undifferentiated cells. The epigenetically-mediated shift in normal tissue to a more undifferentiated state, as described here, may increase the target cell population for subsequent genetic alterations, or may act alone in tumor initiation. In LOI-mediated Wilms tumor in the rare disorder Beckwith-Wiedemann syndrome (BWS), tumors arise because of an expanded population of nephrogenic precursor cells (Beckwith, et al., Pediatr. Pathol. 10, 1 (1990)). Interestingly, we observed pancreatic islet cell hyperplasia, a feature of BWS, in LOI(+) Min mice (data not shown), suggesting that LOI may also predispose to the development of other tumor types. Genetic mechanisms altering cell differentiation and/or disrupting crypt architecture have been described (Haramis et al., Science 303, 1684 (2004); van de Wetering et al., Cell 111, 241 (2002); Yang, et al., Cancer Res. 63, 4990 (2003); Velcich et al., Science 295, 1726 (2002)), although these mechanisms are not common in normal human tissue.

Mice and Genotyping:

H19 mutant mice with C57BL/6J background carrying a deletion in the structural H19 gene (3 kb) and 10 kb of 5′ flanking sequence were obtained. Paternal H19 heterozygotes were maintained without LOI phenotype by breeding female wild-type C57BL/6J and male H19+/−. Experimental crosses were performed between female H19+/− and male Apc+/Min (C57BL/6J). Mice were genotyped as follows using DNA extracted from the tails with DNeasy Tissue Kit (Qiagen, Valencia, Calif.). For H19, PCR was performed using two forward primers and one common reverse primer to obtain an 847-bp product for the wild type allele and a 1,000-bp product for the mutant allele. Primer sequences and annealing temperatures were: H19-F, TCC CCT CGC CTA GTC TGG AAG CA (SEQ ID NO:33); Mutant-F, GAA CTG TTC GCC AGG CTC AAG (SEQ ID NO:34); Common-R, ACA GCA GAC AGC AAG GGG AGG GT (SEQ ID NO:35); 66° C. For Apc, PCR and direct sequencing were performed using the following primers: Apc-F, TTT TGA CGC CAA TCG ACA T (SEQ ID NO:36); Apc-R, GGA ACT CGG TGG TAG AAG CA (SEQ ID NO:37); 55° C. Mice were sacrificed at 42 days and 120 days for tumor quantitation, histology, and immunostaining, and the entire intestine and other organs were collected. In addition, 150 day old H19 mutant mice carrying knock-in alleles of sequence change from GTGG to ATAT in three of the four CTCF target sites within the H19 imprinting control region were established previously and crossed with SD7 mice as described. We compared paternally transmitted mutant alleles (non-LOI) to maternally transmitted alleles (LOI) with immunostaining performed on the same slide. All the animal experiments were performed in accordance with University guidelines.

Tumor Analysis and Immunostaining:

For analysis of numbers and sizes of tumors, the entire intestine was flushed with cold PBS and was opened longitudinally. One half was frozen for further molecular analysis. The other half was fixed with 10% formalin and stained with 0.03% methylene blue, and numbers and sizes of tumors were measured under light microscopy, blinded for genotype.

For histopathological analysis, the entire intestine and other organs were fixed in 4% paraformaldehyde followed by 70% ethanol, and embedded in paraffin. H&E staining and immunohistochemistry against musashi1 (Chemicon, AB5977, 1:200 dilution), twist (Santa Cruz Biotechnology, SC-15393, 1:100 dilution), villin (Chemicon, MAB1671, 1:100 dilution), ephrin-B1 (R&D Systems, AF473, 1:25 dilution) and PCNA (Transduction, P56720, 1:200 dilution) were performed comparing 4 mice in each group over the entire length of the small intestine, to analyze basic morphology, the balance of undifferentiated to differentiated compartments, the proliferation index, and the distribution of proliferative cells. Crypt length was measured from the base of intestinal crypts to the base of intestinal villi. Determinations of crypt length were blinded to genotype and based on a minimum of 5 individual measurements of random, well oriented sections of intestine on each of 2 different histologic sections (10 sections apart), defined as an area with a minimum of three adjacent villi and associated crypts cut perpendicularly to the long axis of the bowel lumen. Musashi1-positive cells were counted using a hemocytometer in 10 individual crypts per mouse that were perpendicularly oriented to the long axis of the intestine. Quantitative image analysis of PCNA labeling was performed using the ACIS II automated image analysis system (Chromavision, San Juan Capistrano, Calif.) with measurements of both the percent and intensity of positive labeling cells determined in 10 individual crypts per mouse that were perpendicularly oriented to the long axis of the intestine. The distribution of proliferative cells was determined using a modification of the method described by Lipkin et al. using a hemocytometer to measure the height of the highest PCNA-positive cell within an intestinal crypt divided by the overall height of that same crypt, again among 10 individual crypts per mouse. Measurements were expressed as a ratio, and the mean ratio for LOI(+) and LOI(−) mice was determined. For determinations of apoptotic rate, sections of the small intestine were evaluated for the number of positive labeling cells within a total of 20 intestinal crypts per mouse using a TUNEL Apoptotic Detection Kit (Upstate, Lake Placid, N.Y.).
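The distribution measurement described above (height of the highest PCNA-positive cell divided by the overall crypt height, averaged over crypts) can be sketched as follows. The function names and the measurement values are hypothetical and serve only to illustrate the arithmetic.

```python
from statistics import mean


def position_ratio(highest_positive_height, crypt_height):
    """Height of the highest PCNA-positive cell as a fraction of crypt height."""
    if crypt_height <= 0:
        raise ValueError("crypt height must be positive")
    return highest_positive_height / crypt_height


def mean_position_ratio(crypt_measurements):
    """Mean ratio over (highest_positive_height, crypt_height) pairs."""
    return mean(position_ratio(h, c) for h, c in crypt_measurements)


# Hypothetical measurements for a few crypts from one mouse (same units per pair).
example_crypts = [(5.0, 10.0), (3.0, 10.0), (4.0, 10.0)]
example_mean_ratio = mean_position_ratio(example_crypts)
```

Because each value is a ratio within a single crypt, the result does not depend on the measurement units, provided both heights in a pair use the same unit.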

The normal colonic mucosa of colonoscopy clinic patients with and without LOI was analyzed by immunostaining for musashi1 and twist. Musashi1 and twist immunolabeling was evaluated independently and blindly within the bottom half of intestinal crypts, the upper half of intestinal crypts and surface epithelium. Positive labeling was scored as nuclear staining with or without cytoplasmic staining in epithelial cells.

RNA and Protein Analysis:

Total RNA was extracted from tumor and non-tumor regions of the frozen intestine using RNeasy Kit with DNase I treatment (Qiagen), and reverse-transcribed using SuperScript II (Invitrogen, Carlsbad, Calif.). Expression level of Igf2 was quantified by real-time RT-PCR using SYBR Green PCR Core Reagents and ABI Prism 7700 Sequence Detection System (Applied Biosystems, San Jose, Calif.), and normalized to that of β-actin. Primers and annealing temperatures are as follows. Igf2: CAT CGT GGA AGA GTG CTG CT (SEQ ID NO:38) and GGG TAT CTG GGG AAG TCG T (SEQ ID NO:39), 60° C. β-actin: TAC CAC CAT GTA CCC AGG CA (SEQ ID NO:40) and GGA GGA GCA ATG ATC TTG AT (SEQ ID NO:41), 60° C.
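Normalizing a target transcript to β-actin and expressing it relative to a reference sample is often done with the comparative threshold-cycle calculation. The source does not state which quantification method was used, so the sketch below is an assumption: it uses the 2^(−ΔΔCt) form, assumes equal amplification efficiency for both primer pairs, and all threshold-cycle (Ct) values are hypothetical.

```python
def relative_expression(ct_target, ct_reference, ct_target_calibrator, ct_reference_calibrator):
    """Fold change of the target gene relative to a calibrator sample.

    Assumption: comparative Ct method with ~100% amplification efficiency.
    `ct_reference` is the Ct of the normalizing gene (e.g., beta-actin).
    """
    delta_sample = ct_target - ct_reference
    delta_calibrator = ct_target_calibrator - ct_reference_calibrator
    return 2.0 ** -(delta_sample - delta_calibrator)


# Hypothetical Ct values: the target crosses threshold one cycle earlier in
# the sample than in the calibrator, with identical reference-gene Ct values.
fold_change = relative_expression(
    ct_target=24.0,
    ct_reference=18.0,
    ct_target_calibrator=25.0,
    ct_reference_calibrator=18.0,
)
```

In this hypothetical case a one-cycle difference corresponds to a two-fold change, which is the scale of the Igf2 increase reported for LOI(+) mice.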

Homogenized samples of small intestine of 42 day mice were subjected to SDS-polyacrylamide gel (16%) electrophoresis with NuPAGE LDS buffer (Invitrogen) after acidification in 1M acetic acid and lyophilization. Gels were transferred onto Immun-Blot PVDF membrane (BioRad, Hercules, Calif.), and the membranes were blocked with blocking buffer (5% non-fat dried milk, 0.1% Tween-20 in TBS) at 4° C. overnight and incubated with a 1:500 dilution of Igf2 antibody (Upstate, Lake Placid, N.Y.) or a 1:1000 dilution of Igf2 antibody (Abcam, Cambridge, Mass.) at room temperature for 1 h. After treatment with HRP-conjugated secondary antibody and ECL detection reagents (Amersham, Piscataway, N.J.) and exposure to X-ray film, signal intensities were measured with a scanning densitometer. The gels were stained with SimplyBlue SafeStain (Invitrogen), and the intensities of the staining were measured with a scanning densitometer to correct the signal intensities.

TABLE 1

Increased adenoma number and surface area in LOI(+) Min mice. Displayed are the adenoma counts, as well as counts corrected for intestinal surface area alone, or for both intestinal and adenoma surface area. Mean ± standard error (SE); P value was calculated by t-test.

                                        Fold increase;                Fold increase;
Genotype      N     Small intestine     P value         Colon         P value

Number of adenomas
LOI(−) Min    81    27.7 ± 1.3          2.2;            1.3 ± 0.1     2.2;
LOI(+) Min    59    60.4 ± 3.7          <0.00001        2.9 ± 0.3     <0.0001

Surface area of adenomas (% of intestine occupied by adenomas)
LOI(−) Min    81    2.2 ± 0.1           2.4;            2.3 ± 0.3     2.5;
LOI(+) Min    59    5.5 ± 0.4           <0.00001        5.8 ± 0.9     <0.001

Number of adenomas/10 cm² of intestine
LOI(−) Min    81    10.8 ± 0.5          1.8;            3.7 ± 0.5     1.9;
LOI(+) Min    59    19.2 ± 1.1          <0.00001        7.0 ± 0.8     <0.0001

TABLE 2

Semi-quantitative analysis of musashi1 staining in intestinal crypts. The number of musashi1-positive cells was analyzed in LOI(−) Min mice and LOI(+) Min mice, and the number of crypts containing ≥6 and <6 musashi1-positive cells is shown. P value was calculated by Fisher exact test.

              Number of crypts
Genotype      ≥6 musashi1 (+) cells    <6 musashi1 (+) cells    P value
LOI(−) Min    5                        35                       <0.01
LOI(+) Min    17                       23

Other Embodiments

It is to be understood that while the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.

What is claimed is:
 1. A method of determining predisposition of a subject to developing a neoplastic or cell proliferation disorder comprising determining the ratio of undifferentiated to differentiated cells in the same or different sample from the subject, wherein the ratio of undifferentiated to differentiated cells, as compared to a reference ratio, is indicative of a predisposition for developing a neoplastic or cell proliferation disorder.
 2. The method of claim 1, further comprising identifying a cell displaying abnormal expression of a target gene that directly or indirectly results from loss of imprinting.
 3. The method of claim 2, wherein the target gene is selected from H19 or IGF2.
 4. The method of claim 2, wherein the target gene is selected from the group consisting of Igf1R, IRS-1, IRS-2, PI3K, Akt, p70S6 kinase, FOXO, GSK3, MDM2, mTOR, Cyclin D1, c-Myc, Shc, Grb2, SOS, Ras, Raf, MEK, Erk, and MAPK gene.
 5. The method of claim 2, wherein the method comprises analyzing the biological sample for a change in the methylation status of a target gene, or a polymorphism thereof.
 6. The method of claim 5, wherein the change in methylation is hypomethylation.
 7. The method of claim 5, wherein the method comprises analyzing the biological sample for hypomethylation of both a DMR of the H19 gene and a DMR of the IGF2 gene.
 8. The method of claim 3, wherein the reference ratio is generated from tissue obtained from a subject comprising cells displaying normal imprinting of at least one of the H19 gene and the IGF2 gene.
 9. The method of claim 1, wherein determining a change in the balance or ratio of undifferentiated to differentiated cells in the sample comprises identifying a biomarker associated with a differentiated or undifferentiated cell.
 10. The method of claim 9, wherein the biomarker is selected from the group consisting of Shh (Sonic hedgehog), Tcf4, Lef1, Twist, EphB2, EphB3, Hes1, Notch1, Hoxa9, Dkk1, Tle6, Tcf3, Bmi1, Kit, Musashi1 (Msi1), Cdx1, Hes5, Oct4, Ki-67, β-catenin, Noggin, BMP4, PTEN (phosphorylated PTEN), Akt (phosphorylated Akt), Villin, Aminopeptidase N (anpep), Sucrase isomaltase (SI), Ephrin-B1 (EfnB1), Cdx2, Crip, Apoa1, Aldh1b1, Calb3, Dgat1, Dgat2, Clu, Hephaestin, Gas1, Ihh (Indian hedgehog), Intrinsic factor B12 receptor, IFABP, and KLF4.
 11. The method of claim 1, wherein determining the ratio of undifferentiated to differentiated cells in the sample comprises: a) imaging the sample using immunohistochemical identification of biomarker molecules specifically associated with a differentiated or undifferentiated cell population; b) imaging the sample using standard microscopy and distinguishing differentiated from undifferentiated cells using morphologic measurements; c) imaging the sample using immunohistochemical identification of proliferation antigens and their distribution within colonic crypts; d) imaging the sample using immunofluorescent identification of molecules specific to a biomarker associated with a differentiated or undifferentiated cell population; e) measuring RNA levels; f) measuring gene expression; g) whole genome expression analyses; or h) allele specific expression.
 12. The method of claim 1, wherein the cells are epithelial cells.
 13. The method of claim 12, wherein the epithelial cells are obtained from a rectal Pap test.
 14. The method of claim 12, wherein the epithelial cells are obtained from intestinal tissue.
 15. The method of claim 14, wherein the intestinal tissue is obtained from the colon.
 16. The method of claim 14, wherein the epithelial cells are obtained from the lumen of the intestinal tissue.
 17. The method of claim 16, wherein the epithelial cells are obtained from the crypts of the lumen.
 18. The method of claim 1, wherein the cell proliferation or neoplastic disorder is associated with a solid tumor.
 19. The method of claim 18, wherein the solid tumor is an adenoma.
 20. The method of claim 19, wherein the adenoma is colorectal cancer.
 21. The method of claim 20 wherein the subject is not previously known to have a colorectal neoplasm.
 22. The method of claim 1, further comprising correlating the ratio derived from the subject with the subject's family genetic history.
 23. The method of claim 1, wherein the subject is subjected to additional tests selected from the group consisting of chest X-rays, colorectal examinations, endoscopic examination, MRI, CAT scanning, gallium scanning, and barium imaging.
 24. The method of claim 1, wherein the subject is a human.
 25. The method of claim 1, wherein the cancer is pancreatic cancer.